diff --git a/master/import-export/nebula-exchange/parameter-reference/ex-ug-parameter/index.html b/master/import-export/nebula-exchange/parameter-reference/ex-ug-parameter/index.html index 4e75f7de17..3ff6ccb091 100644 --- a/master/import-export/nebula-exchange/parameter-reference/ex-ug-parameter/index.html +++ b/master/import-export/nebula-exchange/parameter-reference/ex-ug-parameter/index.html @@ -9043,6 +9043,13 @@
If tags.partition ≤ 1, the number of partitions to be created in NebulaGraph is the same as that in the data source. tags.filter
If edges.partition ≤ 1, the number of partitions to be created in NebulaGraph is the same as that in the data source. edges.filter
Note
-This manual is revised on 2024-4-26, with GitHub commit c0abd6c57d.
+This manual is revised on 2024-4-26, with GitHub commit 924040ba96.
NebulaGraph is a distributed, scalable, and lightning-fast graph database. It is capable of hosting graphs with dozens of billions of vertices (nodes) and trillions of edges (relationships) while serving queries with millisecond latency.
"},{"location":"#getting_started","title":"Getting started","text":"Note
Additional information or operation-related notes.
Caution
May have adverse effects, such as causing performance degradation or triggering known minor problems.
Warning
May lead to serious issues, such as data loss or system crash.
Danger
May lead to extremely serious issues, such as system damage or information leakage.
Compatibility
The compatibility notes between nGQL and openCypher, or between the current version of nGQL and its prior ones.
Enterpriseonly
Differences between the NebulaGraph Community and Enterprise editions.
"},{"location":"#modify_errors","title":"Modify errors","text":"This NebulaGraph manual is written in the Markdown language. Users can click the pencil sign on the upper right side of each document title and modify errors.
"},{"location":"nebula-bench/","title":"NebulaGraph Bench","text":"NebulaGraph Bench is a performance test tool for NebulaGraph using the LDBC data set.
"},{"location":"nebula-bench/#scenario","title":"Scenario","text":"Release
"},{"location":"nebula-bench/#test_process","title":"Test process","text":"For detailed usage instructions, see NebulaGraph Bench.
"},{"location":"nebula-console/","title":"NebulaGraph Console","text":"NebulaGraph Console is a native CLI client for NebulaGraph. It can be used to connect a NebulaGraph cluster and execute queries. It also supports special commands to manage parameters, export query results, import test datasets, etc.
"},{"location":"nebula-console/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"nebula-console/#obtain_nebulagraph_console","title":"Obtain NebulaGraph Console","text":"You can obtain NebulaGraph Console in the following ways:
To connect to NebulaGraph with the nebula-console
binary file, use the following syntax:
<path_of_console> -addr <ip> -port <port> -u <username> -p <password>\n
path_of_console
indicates the storage path of the NebulaGraph Console binary file. For example:
Direct link to NebulaGraph
./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula\n
Enable SSL encryption and require two-way authentication
./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula -enable_ssl -ssl_root_ca_path /home/xxx/cert/root.crt -ssl_cert_path /home/xxx/cert/client.crt -ssl_private_key_path /home/xxx/cert/client.key\n
Parameter descriptions are as follows:
Parameter Description -h/-help
Shows the help menu. -addr/-address
Sets the IP or hostname of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is millisecond. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. -ssl_insecure_skip_verify
Specifies whether the client skips verifying the server's certificate chain and hostname. The default is false
. If set to true
, any certificate chain and hostname provided by the server are accepted. For more parameters, see the project repository.
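For instance, a hedged sketch of running statements non-interactively with the -e and -f options above (the address, the credentials, and the file name demo.ngql are placeholders):
./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula -e "SHOW HOSTS"\n./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula -f demo.ngql  # demo.ngql is a placeholder script\n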
"},{"location":"nebula-console/#manage_parameters","title":"Manage parameters","text":"You can save parameters for parameterized queries.
Note
Parameters are not supported in SAMPLE
clauses. The command to save a parameter is as follows:
nebula> :param <param_name> => <param_value>;\n
The example is as follows:
nebula> :param p1 => \"Tim Duncan\";\nnebula> MATCH (v:player{name:$p1})-[:follow]->(n) RETURN v,n;\n+----------------------------------------------------+-------------------------------------------------------+\n| v | n |\n+----------------------------------------------------+-------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n+----------------------------------------------------+-------------------------------------------------------+\nnebula> :param p2 => {\"a\":3,\"b\":false,\"c\":\"Tim Duncan\"};\nnebula> RETURN $p2.b AS b;\n+-------+\n| b |\n+-------+\n| false |\n+-------+\n
The command to view the saved parameters is as follows:
nebula> :params;\n
The command to view the specified parameters is as follows:
nebula> :params <param_name>;\n
The command to delete a specified parameter is as follows:
nebula> :param <param_name> =>;\n
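For instance, a small sketch that views and then removes the parameter p1 saved in the example above:
nebula> :params p1;\nnebula> :param p1 =>;\nnebula> :params;\n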
Query results can be exported and saved as a CSV file, a DOT file, or in Profile/Explain format.
Note
The exported file is stored in the current working directory, i.e., the directory that pwd
shows. The command to export a CSV file is as follows:
nebula> :CSV <file_name.csv>;\n
The command to export a DOT file is as follows:
nebula> :dot <file_name.dot>\n
The example is as follows:
nebula> :dot a.dot\nnebula> PROFILE FORMAT=\"dot\" GO FROM \"player100\" OVER follow;\n
The command to export a PROFILE or EXPLAIN format is as follows:
nebula> :profile <file_name>;\n
or nebula> :explain <file_name>;\n
Note
The text file output by the above commands is preferred for reporting issues on GitHub, sharing execution plans in the forums, and tuning graph queries, because it contains more information and is more readable than a screenshot or a CSV file from Studio.
The example is as follows:
nebula> :profile profile.log\nnebula> PROFILE GO FROM \"player102\" OVER serve YIELD dst(edge);\nnebula> :profile profile.dot\nnebula> PROFILE FORMAT=\"dot\" GO FROM \"player102\" OVER serve YIELD dst(edge);\nnebula> :explain explain.log\nnebula> EXPLAIN GO FROM \"player102\" OVER serve YIELD dst(edge);\n
The testing dataset is named basketballplayer
. To view details about the schema and data, use the corresponding SHOW
command.
The command to import a testing dataset is as follows:
nebula> :play basketballplayer\n
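After the import finishes, a hedged sketch of inspecting the dataset with the corresponding SHOW commands (the output depends on your deployment):
nebula> USE basketballplayer;\nnebula> SHOW TAGS;\nnebula> SHOW EDGES;\n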
"},{"location":"nebula-console/#run_a_command_multiple_times","title":"Run a command multiple times","text":"To run a command multiple times, use the following command:
nebula> :repeat N\n
The example is as follows:
nebula> :repeat 3\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\nGot 2 rows (time spent 2602/3214 us)\n\nFri, 20 Aug 2021 06:36:05 UTC\n\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\nGot 2 rows (time spent 583/849 us)\n\nFri, 20 Aug 2021 06:36:05 UTC\n\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\nGot 2 rows (time spent 496/671 us)\n\nFri, 20 Aug 2021 06:36:05 UTC\n\nExecuted 3 times, (total time spent 3681/4734 us), (average time spent 1227/1578 us)\n
"},{"location":"nebula-console/#sleep","title":"Sleep","text":"This command will make NebulaGraph Console sleep for N seconds. The schema is altered in an async way and takes effect in the next heartbeat cycle. Therefore, this command is usually used when altering schema. The command is as follows:
nebula> :sleep N\n
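For example, a sketch that waits for an asynchronous schema change to take effect before reading it back; the tag name t1 and the 20-second wait are illustrative assumptions, as the appropriate wait depends on the heartbeat interval:
nebula> CREATE TAG IF NOT EXISTS t1(p1 string);  # t1 is a hypothetical tag\nnebula> :sleep 20\nnebula> SHOW TAGS;\n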
"},{"location":"nebula-console/#disconnect_nebulagraph_console_from_nebulagraph","title":"Disconnect NebulaGraph Console from NebulaGraph","text":"You can use :EXIT
or :QUIT
to disconnect from NebulaGraph. For convenience, NebulaGraph Console supports using these commands in lower case without the colon (\":\"), such as quit
.
The example is as follows:
nebula> :QUIT\n\nBye root!\n
"},{"location":"1.introduction/1.what-is-nebula-graph/","title":"What is NebulaGraph","text":"NebulaGraph is an open-source, distributed, easily scalable, and native graph database. It is capable of hosting graphs with hundreds of billions of vertices and trillions of edges, and serving queries with millisecond-latency.
"},{"location":"1.introduction/1.what-is-nebula-graph/#what_is_a_graph_database","title":"What is a graph database","text":"A graph database, such as NebulaGraph, is a database that specializes in storing vast graph networks and retrieving information from them. It efficiently stores data as vertices (nodes) and edges (relationships) in labeled property graphs. Properties can be attached to both vertices and edges. Each vertex can have one or multiple tags (labels).
Graph databases are well suited for storing most kinds of data models abstracted from reality. Things are connected in almost all fields in the world. Modeling systems such as relational databases extract the relationships between entities and squeeze them into table columns, with their types and properties stored in other columns or even other tables. This makes data management time-consuming and costly.
NebulaGraph, as a typical native graph database, allows you to store the rich relationships as edges with edge types and properties directly attached to them.
"},{"location":"1.introduction/1.what-is-nebula-graph/#advantages_of_nebulagraph","title":"Advantages of NebulaGraph","text":""},{"location":"1.introduction/1.what-is-nebula-graph/#open_source","title":"Open source","text":"NebulaGraph is open under the Apache 2.0 License. More and more people such as database developers, data scientists, security experts, and algorithm engineers are participating in the designing and development of NebulaGraph. To join the opening of source code and ideas, surf the NebulaGraph GitHub page.
"},{"location":"1.introduction/1.what-is-nebula-graph/#outstanding_performance","title":"Outstanding performance","text":"Written in C++ and born for graphs, NebulaGraph handles graph queries in milliseconds. Among most databases, NebulaGraph shows superior performance in providing graph data services. The larger the data size, the greater the superiority of NebulaGraph.For more information, see NebulaGraph benchmarking.
"},{"location":"1.introduction/1.what-is-nebula-graph/#high_scalability","title":"High scalability","text":"NebulaGraph is designed in a shared-nothing architecture and supports scaling in and out without interrupting the database service.
"},{"location":"1.introduction/1.what-is-nebula-graph/#developer_friendly","title":"Developer friendly","text":"NebulaGraph supports clients in popular programming languages like Java, Python, C++, and Go, and more are under development. For more information, see NebulaGraph clients.
"},{"location":"1.introduction/1.what-is-nebula-graph/#reliable_access_control","title":"Reliable access control","text":"NebulaGraph supports strict role-based access control and external authentication servers such as LDAP (Lightweight Directory Access Protocol) servers to enhance data security. For more information, see Authentication and authorization.
"},{"location":"1.introduction/1.what-is-nebula-graph/#diversified_ecosystem","title":"Diversified ecosystem","text":"More and more native tools of NebulaGraph have been released, such as NebulaGraph Studio, NebulaGraph Console, and NebulaGraph Exchange. For more ecosystem tools, see Ecosystem tools overview.
Besides, NebulaGraph can be integrated with many cutting-edge technologies, such as Spark, Flink, and HBase, so that the tools strengthen one another.
"},{"location":"1.introduction/1.what-is-nebula-graph/#opencypher-compatible_query_language","title":"OpenCypher-compatible query language","text":"The native NebulaGraph Query Language, also known as nGQL, is a declarative, openCypher-compatible textual query language. It is easy to understand and easy to use. For more information, see nGQL guide.
"},{"location":"1.introduction/1.what-is-nebula-graph/#future-oriented_hardware_with_balanced_reading_and_writing","title":"Future-oriented hardware with balanced reading and writing","text":"Solid-state drives have extremely high performance and they are getting cheaper. NebulaGraph is a product based on SSD. Compared with products based on HDD and large memory, it is more suitable for future hardware trends and easier to achieve balanced reading and writing.
"},{"location":"1.introduction/1.what-is-nebula-graph/#easy_data_modeling_and_high_flexibility","title":"Easy data modeling and high flexibility","text":"You can easily model the connected data into NebulaGraph for your business without forcing them into a structure such as a relational table, and properties can be added, updated, and deleted freely. For more information, see Data modeling.
"},{"location":"1.introduction/1.what-is-nebula-graph/#high_popularity","title":"High popularity","text":"NebulaGraph is being used by tech leaders such as Tencent, Vivo, Meituan, and JD Digits. For more information, visit the NebulaGraph official website.
"},{"location":"1.introduction/1.what-is-nebula-graph/#use_cases","title":"Use cases","text":"NebulaGraph can be used to support various graph-based scenarios. To spare the time spent on pushing the kinds of data mentioned in this section into relational databases and on bothering with join queries, use NebulaGraph.
"},{"location":"1.introduction/1.what-is-nebula-graph/#fraud_detection","title":"Fraud detection","text":"Financial institutions have to traverse countless transactions to piece together potential crimes and understand how combinations of transactions and devices might be related to a single fraud scheme. This kind of scenario can be modeled in graphs, and with the help of NebulaGraph, fraud rings and other sophisticated scams can be easily detected.
"},{"location":"1.introduction/1.what-is-nebula-graph/#real-time_recommendation","title":"Real-time recommendation","text":"NebulaGraph offers the ability to instantly process the real-time information produced by a visitor and make accurate recommendations on articles, videos, products, and services.
"},{"location":"1.introduction/1.what-is-nebula-graph/#intelligent_question-answer_system","title":"Intelligent question-answer system","text":"Natural languages can be transformed into knowledge graphs and stored in NebulaGraph. A question organized in a natural language can be resolved by a semantic parser in an intelligent question-answer system and re-organized. Then, possible answers to the question can be retrieved from the knowledge graph and provided to the one who asked the question.
"},{"location":"1.introduction/1.what-is-nebula-graph/#social_networking","title":"Social networking","text":"Information on people and their relationships is typical graph data. NebulaGraph can easily handle the social networking information of billions of people and trillions of relationships, and provide lightning-fast queries for friend recommendations and job promotions in the case of massive concurrency.
"},{"location":"1.introduction/1.what-is-nebula-graph/#related_links","title":"Related links","text":"In graph theory, a path in a graph is a finite or infinite sequence of edges which joins a sequence of vertices. Paths are fundamental concepts of graph theory.
Paths can be categorized into 3 types: walk
, trail
, and path
. For more information, see Wikipedia.
The following figure is an example for a brief introduction.
"},{"location":"1.introduction/2.1.path/#walk","title":"Walk","text":"A walk
is a finite or infinite sequence of edges. Both vertices and edges can be repeatedly visited in graph traversal.
In the above figure, C, D, and E form a cycle. So, this figure contains an infinite number of walks, such as A->B->C->D->E
, A->B->C->D->E->C
, and A->B->C->D->E->C->D
.
Note
GO
statements use walk
.
A trail
is a finite sequence of edges. Only vertices can be repeatedly visited in graph traversal. The Seven Bridges of Königsberg is a typical trail
.
In the above figure, edges cannot be repeatedly visited, so this figure contains a finite number of trails. The longest trail in this figure consists of 5 edges: A->B->C->D->E->C
.
Note
MATCH
, FIND PATH
, and GET SUBGRAPH
statements use trail
.
There are two special cases of trail, cycle
and circuit
. The following figure is an example for a brief introduction.
cycle
A cycle
refers to a closed trail
. Only the terminal vertices can be repeatedly visited. The longest path in this figure consists of 3 edges: A->B->C->A
or C->D->E->C
.
circuit
A circuit
refers to a closed trail
. Edges cannot be repeatedly visited in graph traversal. Apart from the terminal vertices, other vertices can also be repeatedly visited. The longest path in this figure: A->B->C->D->E->C->A
.
A path
is a finite sequence of edges. Neither vertices nor edges can be repeatedly visited in graph traversal.
So, the above figure contains finite paths. The longest path in this figure consists of 4 edges: A->B->C->D->E
.
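To make these distinctions concrete, here is a hedged nGQL sketch against the basketballplayer dataset used in the console examples above (the vertex IDs are illustrative); since FIND PATH uses trail, no edge is repeated in the returned paths:
nebula> FIND ALL PATH FROM "player100" TO "player125" OVER follow YIELD path AS p;  # IDs are illustrative\n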
A data model is a model that organizes data and specifies how they are related to one another. This topic describes the NebulaGraph data model and provides suggestions for data modeling with NebulaGraph.
"},{"location":"1.introduction/2.data-model/#data_structures","title":"Data structures","text":"NebulaGraph data model uses six data structures to store data. They are graph spaces, vertices, edges, tags, edge types and properties.
In NebulaGraph, vertices are identified with vertex identifiers (i.e. VID
). The VID
must be unique in the same graph space. Its data type must be INT64 or FIXED_STRING(N).
Compatibility
In NebulaGraph 2.x, a vertex must have at least one tag, while in NebulaGraph master a tag is not required for a vertex.
->
identifies the directions of edges. Edges can be traversed in either direction.<a source vertex, an edge type, a rank value, and a destination vertex>
. Edges have no EID.Note
Tags and edge types are similar to the "vertex tables" and "edge tables" in relational databases.
"},{"location":"1.introduction/2.data-model/#directed_property_graph","title":"Directed property graph","text":"NebulaGraph stores data in directed property graphs. A directed property graph has a set of vertices connected by directed edges. Both vertices and edges can have properties. A directed property graph is represented as:
G = < V, E, PV, PE >, where V is a set of vertices, E is a set of directed edges, PV represents the properties on vertices, and PE represents the properties on edges.
The following table is an example of the structure of the basketball player dataset. It has two types of vertices, player and team, and two types of edges, serve and follow.
Element Name Property name (Data type) Description
Tag player name (string), age (int) Represents players in the team. The properties name and age indicate the player's name and age.
Tag team name (string) Represents the teams. The property name indicates the team name.
Edge type serve start_year (int), end_year (int) Represents the action of a player serving a team. The action links the player to the team, and the direction is from the player to the team. The properties start_year and end_year indicate the start year and end year of the service respectively.
Edge type follow degree (int) Represents the action of a player following another player on Twitter. The action links one player to the other, and the direction is from one player to the other. The property degree indicates how well the follower likes the followee.
Note
NebulaGraph supports only directed edges.
Compatibility
NebulaGraph master allows dangling edges. Therefore, when adding or deleting, you need to ensure the corresponding source vertex and destination vertex of an edge exist. For details, see INSERT VERTEX, DELETE VERTEX, INSERT EDGE, and DELETE EDGE.
The MERGE statement in openCypher is not supported.
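To make the model above concrete, here is a hedged nGQL sketch that declares this schema and inserts one vertex and one edge. The property values follow the basketballplayer examples used elsewhere in this manual, and as noted above, ensuring that both endpoint vertices of an edge exist is the user's responsibility:
nebula> CREATE TAG player(name string, age int);\nnebula> CREATE TAG team(name string);\nnebula> CREATE EDGE serve(start_year int, end_year int);\nnebula> CREATE EDGE follow(degree int);\nnebula> INSERT VERTEX player(name, age) VALUES "player100":("Tim Duncan", 42);\nnebula> INSERT EDGE follow(degree) VALUES "player100" -> "player101":(95);  # values follow the sample dataset\n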
"},{"location":"1.introduction/3.vid/","title":"VID","text":"In a graph space, a vertex is uniquely identified by its ID, which is called a VID or a Vertex ID.
"},{"location":"1.introduction/3.vid/#features","title":"Features","text":"FIXED_STRING(<N>)
or INT64
. One graph space can only select one VID type.Vertices with the same VID will be identified as the same one. For example:
INSERT
statements (neither uses a parameter of IF NOT EXISTS
) with the same VID and tag are operated at the same time, the latter INSERT
will overwrite the former.INSERT
statements with the same VID but different tags, like TAG A
and TAG B
, are operated at the same time, the operation of Tag A
will not affect Tag B
.INT64
while NebulaGraph 2.x supports INT64
and FIXED_STRING(<N>)
. In CREATE SPACE
, VID types can be set via vid_type
.id()
function can be used to specify or locate a VID.LOOKUP
or MATCH
statements can be used to find a VID via property index.DELETE xxx WHERE id(xxx) == \"player100\"
or GO FROM \"player100\"
. Finding VIDs via properties and then operating the graph will cause poor performance, such as LOOKUP | GO FROM $-.ids
, which will run both LOOKUP
and |
one more time.VIDs can be generated via applications. Here are some tips:
N
of FIXED_STRING(<N>)
too much. Otherwise, it will occupy a lot of memory and hard disks, and slow down performance. Generate VIDs via BASE64, MD5, hash by encoding and splicing.The data type of a VID must be defined when you create the graph space. Once defined, it cannot be modified.
A VID is set when you insert a vertex and cannot be modified.
"},{"location":"1.introduction/3.vid/#query_start_vid_and_global_scan","title":"Querystart vid
and global scan","text":"In most cases, the execution plan of query statements in NebulaGraph (MATCH
, GO
, and LOOKUP
) must query the start vid
in a certain way.
There are only two ways to locate start vid
:
For example, GO FROM \"player100\" OVER
explicitly indicates in the statement that start vid
is \"player100\".
For example, LOOKUP ON player WHERE player.name == \"Tony Parker\"
or MATCH (v:player {name:\"Tony Parker\"})
locates start vid
by the index of the property player.name
.
NebulaGraph consists of three services: the Graph Service, the Storage Service, and the Meta Service. It applies the separation of storage and computing architecture.
Each service has its executable binaries and processes launched from the binaries. Users can deploy a NebulaGraph cluster on a single machine or multiple machines using these binaries.
The following figure shows the architecture of a typical NebulaGraph cluster.
"},{"location":"1.introduction/3.nebula-graph-architecture/1.architecture-overview/#the_meta_service","title":"The Meta Service","text":"The Meta Service in the NebulaGraph architecture is run by the nebula-metad processes. It is responsible for metadata management, such as schema operations, cluster administration, and user privilege management.
For details on the Meta Service, see Meta Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/1.architecture-overview/#the_graph_service_and_the_storage_service","title":"The Graph Service and the Storage Service","text":"NebulaGraph applies the separation of storage and computing architecture. The Graph Service is responsible for querying. The Storage Service is responsible for storage. They are run by different processes, i.e., nebula-graphd and nebula-storaged. The benefits of the separation of storage and computing architecture are as follows:
The separated structure makes both the Graph Service and the Storage Service flexible and easy to scale in or out.
If part of the Graph Service fails, the data stored by the Storage Service suffers no loss. And if the rest part of the Graph Service is still able to serve the clients, service recovery can be performed quickly, even unfelt by the users.
The separation of storage and computing architecture provides a higher resource utilization rate, and it enables clients to manage the cost flexibly according to business demands.
With the ability to run separately, the Graph Service may work with multiple types of storage engines, and the Storage Service may also serve more types of computing engines.
For details on the Graph Service and the Storage Service, see Graph Service and Storage Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/","title":"Meta Service","text":"This topic introduces the architecture and functions of the Meta Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#the_architecture_of_the_meta_service","title":"The architecture of the Meta Service","text":"The architecture of the Meta Service is as follows:
The Meta Service is run by nebula-metad processes. Users can deploy nebula-metad processes according to the scenario:
All the nebula-metad processes form a Raft-based cluster, with one process as the leader and the others as the followers.
The leader is elected by the majorities and only the leader can provide service to the clients or other components of NebulaGraph. The followers will be run in a standby way and each has a data replication of the leader. Once the leader fails, one of the followers will be elected as the new leader.
Note
The data of the leader and the followers will keep consistent through Raft. Thus the breakdown and election of the leader will not cause data inconsistency. For more information on Raft, see Storage service architecture.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#functions_of_the_meta_service","title":"Functions of the Meta Service","text":""},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_user_accounts","title":"Manages user accounts","text":"The Meta Service stores the information of user accounts and the privileges granted to the accounts. When the clients send queries to the Meta Service through an account, the Meta Service checks the account information and whether the account has the right privileges to execute the queries or not.
For more information on NebulaGraph access control, see Authentication.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_partitions","title":"Manages partitions","text":"The Meta Service stores and manages the locations of the storage partitions and helps balance the partitions.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_graph_spaces","title":"Manages graph spaces","text":"NebulaGraph supports multiple graph spaces. Data stored in different graph spaces are securely isolated. The Meta Service stores the metadata of all graph spaces and tracks the changes of them, such as adding or dropping a graph space.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_schema_information","title":"Manages schema information","text":"NebulaGraph is a strong-typed graph database. Its schema contains tags (i.e., the vertex types), edge types, tag properties, and edge type properties.
The Meta Service stores the schema information. Besides, it performs the addition, modification, and deletion of the schema, and logs the versions of them.
For more information on NebulaGraph schema, see Data model.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_ttl_information","title":"Manages TTL information","text":"The Meta Service stores the definition of TTL (Time to Live) options which are used to control data expiration. The Storage Service takes care of the expiring and evicting processes. For more information, see TTL.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_jobs","title":"Manages jobs","text":"The Job Management module in the Meta Service is responsible for the creation, queuing, querying, and deletion of jobs.
"},{"location":"1.introduction/3.nebula-graph-architecture/3.graph-service/","title":"Graph Service","text":"The Graph Service is used to process the query. It has four submodules: Parser, Validator, Planner, and Executor. This topic will describe the Graph Service accordingly.
"},{"location":"1.introduction/3.nebula-graph-architecture/3.graph-service/#the_architecture_of_the_graph_service","title":"The architecture of the Graph Service","text":"After a query is sent to the Graph Service, it will be processed by the following four submodules:
Parser: Performs lexical analysis and syntax analysis.
Validator: Validates the statements.
Planner: Generates and optimizes the execution plans.
Executor: Executes the plans with operators.
After receiving a request, the statements will be parsed by Parser composed of Flex (lexical analysis tool) and Bison (syntax analysis tool), and its corresponding AST will be generated. Statements will be directly intercepted in this stage because of their invalid syntax.
For example, the structure of the AST of GO FROM \"Tim\" OVER like WHERE properties(edge).likeness > 8.0 YIELD dst(edge)
is shown in the following figure.
Validator performs a series of validations on the AST. It mainly works on these tasks:
Validator will validate whether the metadata is correct or not.
When parsing the OVER
, WHERE
, and YIELD
clauses, Validator looks up the Schema and verifies whether the edge type and tag data exist or not. For an INSERT
statement, Validator verifies whether the types of the inserted data are the same as the ones defined in the Schema.
Validator will verify whether the cited variable exists or not, or whether the cited property is variable or not.
For composite statements, like $var = GO FROM \"Tim\" OVER like YIELD dst(edge) AS ID; GO FROM $var.ID OVER serve YIELD dst(edge)
, Validator verifies first to see if var
is defined, and then to check if the ID
property is attached to the var
variable.
Validator infers what type the result of an expression is and verifies the type against the specified clause.
For example, the WHERE
clause requires the result to be a bool
value, a NULL
value, or empty
.
*
Validator needs to verify all the Schema that involves *
when verifying the clause if there is a *
in the statement.
Take a statement like GO FROM \"Tim\" OVER * YIELD dst(edge), properties(edge).likeness, dst(edge)
as an example. When verifying the OVER
clause, Validator needs to verify all the edge types. If the edge type includes like
and serve
, the statement would be GO FROM \"Tim\" OVER like,serve YIELD dst(edge), properties(edge).likeness, dst(edge)
.
Validator will check the consistency of the clauses before and after the |
.
In the statement GO FROM \"Tim\" OVER like YIELD dst(edge) AS ID | GO FROM $-.ID OVER serve YIELD dst(edge)
, Validator will verify whether $-.ID
is defined in the clause before the |
.
When the validation succeeds, an execution plan will be generated. Its data structure will be stored in the src/planner
directory.
In the nebula-graphd.conf
file, when enable_optimizer
is set to be false
, Planner will not optimize the execution plans generated by Validator. It will be executed by Executor directly.
In the nebula-graphd.conf
file, when enable_optimizer
is set to be true
, Planner will optimize the execution plans generated by Validator. The structure is as follows.
In the execution plan on the right side of the preceding figure, each node directly depends on other nodes. For example, the root node Project
depends on the Filter
node, the Filter
node depends on the GetNeighbor
node, and so on, up to the leaf node Start
. Then the execution plan is (not truly) executed.
During this stage, every node has its input and output variables, which are stored in a hash table. The execution plan is not truly executed, so the value of each key in the associated hash table is empty (except for the Start
node, where the input variables hold the starting data), and the hash table is defined in src/context/ExecutionContext.cpp
under the nebula-graph
repository.
For example, if the hash table is named as ResultMap
when creating the Filter
node, users can determine that the node takes data from ResultMap[\"GN1\"]
, then puts the result into ResultMap[\"Filter2\"]
, and so on. All these work as the input and output of each node.
The optimization rules that Planner has implemented so far are considered RBO (Rule-Based Optimization), namely the pre-defined optimization rules. The CBO (Cost-Based Optimization) feature is under development. The optimized code is in the src/optimizer/
directory under the nebula-graph
repository.
RBO is a \u201cbottom-up\u201d exploration process. For each rule, the root node of the execution plan (in this case, the Project
node) is the entry point, and step by step along with the node dependencies, it reaches the node at the bottom to see if it matches the rule.
As shown in the preceding figure, when the Filter
node is explored, it is found that its children node is GetNeighbors
, which matches successfully with the pre-defined rules, so a transformation is initiated to integrate the Filter
node into the GetNeighbors
node, the Filter
node is removed, and then the process continues to the next rule. Therefore, when the GetNeighbor
operator calls interfaces of the Storage layer to get the neighboring edges of a vertex during the execution stage, the Storage layer will directly filter out the unqualified edges internally. Such optimization greatly reduces the amount of data transfer, which is commonly known as filter pushdown.
The Executor module consists of Scheduler and Executor. The Scheduler generates the corresponding execution operators against the execution plan, starting from the leaf nodes and ending at the root node. The structure is as follows.
Each node of the execution plan has one execution operator node, whose input and output have been determined in the execution plan. Each operator only needs to get the values for the input variables, compute them, and finally put the results into the corresponding output variables. Therefore, it is only necessary to execute step by step from Start
, and the result of the last operator is returned to the user as the final result.
The source code hierarchy under the nebula-graph repository is as follows.
|--src\n |--graph\n |--context //contexts for validation and execution\n |--executor //execution operators\n |--gc //garbage collector\n |--optimizer //optimization rules\n |--planner //structure of the execution plans\n |--scheduler //scheduler\n |--service //external service management\n |--session //session management\n |--stats //monitoring metrics\n |--util //basic components\n |--validator //validation of the statements\n |--visitor //visitor expression\n
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/","title":"Storage Service","text":"The persistent data of NebulaGraph have two parts. One is the Meta Service that stores the meta-related data.
The other is the Storage Service that stores the data, which is run by the nebula-storaged process. This topic will describe the architecture of the Storage Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#advantages","title":"Advantages","text":"The Storage Service is run by the nebula-storaged process. Users can deploy nebula-storaged processes on different occasions. For example, users can deploy 1 nebula-storaged process in a test environment and deploy 3 nebula-storaged processes in a production environment.
All the nebula-storaged processes consist of a Raft-based cluster. There are three layers in the Storage Service:
Storage interface
The top layer is the storage interface. It defines a set of APIs that are related to the graph concepts. These API requests will be translated into a set of KV operations targeting the corresponding Partition. For example:
getNeighbors
: queries the in-edge or out-edge of a set of vertices, returns the edges and the corresponding properties, and supports conditional filtering.insert vertex/edge
: inserts a vertex or edge and its properties.getProps
: gets the properties of a vertex or an edge.It is this layer that makes the Storage Service a real graph storage. Otherwise, it is just a KV storage.
Consensus
Below the storage interface is the consensus layer that implements Multi Group Raft, which ensures the strong consistency and high availability of the Storage Service.
Store engine
The bottom layer is the local storage engine library, providing operations like get
, put
, and scan
on local disks. The related interfaces are stored in KVStore.h
and KVEngine.h
files. You can develop your own local store plugins based on your needs.
The following will describe some features of the Storage Service based on the above architecture.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#storage_writing_process","title":"Storage writing process","text":""},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#kvstore","title":"KVStore","text":"NebulaGraph develops and customizes its built-in KVStore for the following reasons.
Therefore, NebulaGraph develops its own KVStore with RocksDB as the local storage engine. The advantages are as follows.
The Meta Service manages all the Storage servers. All the partition distribution data and current machine status can be found in the meta service. Accordingly, users can execute a manual load balancing plan in meta service.
Note
NebulaGraph does not support auto load balancing because auto data transfer will affect online business.
Graphs consist of vertices and edges. NebulaGraph uses key-value pairs to store vertices, edges, and their properties. Vertices and edges are stored in keys and their properties are stored in values. Such structure enables efficient property filtering.
The storage structure of vertices
Different from NebulaGraph version 2.x, version 3.x added a new key for each vertex. Compared to the old key that still exists, the new key has no TagID
field and no value. Vertices in NebulaGraph can now live without tags owing to the new key.
Type
One byte, used to indicate the key type. PartID
Three bytes, used to indicate the sharding partition and to scan the partition data based on the prefix when re-balancing the partition. VertexID
The vertex ID. For an integer VertexID, it occupies eight bytes. However, for a string VertexID, it is changed to fixed_string
of a fixed length which needs to be specified by users when they create the space. TagID
Four bytes, used to indicate the tags that the vertex relates to. SerializedValue
The serialized value of the key. It stores the property information of the vertex. Type
One byte, used to indicate the key type. PartID
Three bytes, used to indicate the partition ID. This field can be used to scan the partition data based on the prefix when re-balancing the partition. VertexID
Used to indicate vertex ID. The former VID refers to the source VID in the outgoing edge and the dest VID in the incoming edge, while the latter VID refers to the dest VID in the outgoing edge and the source VID in the incoming edge. Edge Type
Four bytes, used to indicate the edge type. Greater than zero indicates out-edge, less than zero means in-edge. Rank
Eight bytes, used to indicate multiple edges in one edge type. Users can set the field based on needs and store weight, such as transaction time and transaction number. PlaceHolder
One byte. Reserved. SerializedValue
The serialized value of the key. It stores the property information of the edge. NebulaGraph uses strong-typed Schema.
NebulaGraph will store the properties of vertex and edges in order after encoding them. Since the length of fixed-length properties is fixed, queries can be made in no time according to offset. Before decoding, NebulaGraph needs to get (and cache) the schema information in the Meta Service. In addition, when encoding properties, NebulaGraph will add the corresponding schema version to support online schema change.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#data_partitioning","title":"Data partitioning","text":"Since in an ultra-large-scale relational network, vertices can be as many as tens to hundreds of billions, and edges are even more than trillions. Even if only vertices and edges are stored, the storage capacity of both exceeds that of ordinary servers. Therefore, NebulaGraph uses hash to shard the graph elements and store them in different partitions.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#edge_partitioning_and_storage_amplification","title":"Edge partitioning and storage amplification","text":"In NebulaGraph, an edge corresponds to two key-value pairs on the hard disk. When there are lots of edges and each has many properties, storage amplification will be obvious. The storage format of edges is shown in the figure below.
In this example, SrcVertex connects DstVertex via EdgeA, forming the path of (SrcVertex)-[EdgeA]->(DstVertex)
. SrcVertex, DstVertex, and EdgeA will all be stored in Partition x and Partition y as four key-value pairs in the storage layer. Details are as follows:
EdgeA_Out and EdgeA_In are stored in storage layer with opposite directions, constituting EdgeA logically. EdgeA_Out is used for traversal requests starting from SrcVertex, such as (a)-[]->()
; EdgeA_In is used for traversal requests starting from DstVertex, such as ()-[]->(a)
.
Like EdgeA_Out and EdgeA_In, NebulaGraph redundantly stores the information of each edge, which doubles the actual capacities needed for edge storage. The key corresponding to the edge occupies a small hard disk space, but the space occupied by Value is proportional to the length and amount of the property value. Therefore, it will occupy a relatively large hard disk space if the property value of the edge is large or there are many edge property values.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#partition_algorithm","title":"Partition algorithm","text":"NebulaGraph uses a static Hash strategy to shard data through a modulo operation on vertex ID. All the out-keys, in-keys, and tag data will be placed in the same partition. In this way, query efficiency is increased dramatically.
Note
The number of partitions needs to be determined when users are creating a graph space since it cannot be changed afterward. Users are supposed to take into consideration the demands of future business when setting it.
When inserting into NebulaGraph, vertices and edges are distributed across different partitions. And the partitions are located on different machines. The number of partitions is set in the CREATE SPACE statement and cannot be changed afterward.
If certain vertices need to be placed on the same partition (i.e., on the same machine), see Formula/code.
The following code will briefly describe the relationship between VID and partition.
// If VertexID occupies 8 bytes, it will be stored in int64 to be compatible with the version 1.0.\nuint64_t vid = 0;\nif (id.size() == 8) {\n memcpy(static_cast<void*>(&vid), id.data(), 8);\n} else {\n MurmurHash2 hash;\n vid = hash(id.data());\n}\nPartitionID pId = vid % numParts + 1;\n
Roughly speaking, after hashing a fixed string to int64, (the hashing of int64 is the number itself), do modulo, and then plus one, namely:
pId = vid % numParts + 1;\n
Parameters and descriptions of the preceding formula are as follows:
Parameter Description%
The modulo operation. numParts
The number of partitions for the graph space where the VID
is located, namely the value of partition_num
in the CREATE SPACE statement. pId
The ID for the partition where the VID
is located. Suppose there are 100 partitions, the vertices with VID
1, 101, and 1001 will be stored on the same partition. But, the mapping between the partition ID and the machine address is random. Therefore, we cannot assume that any two partitions are located on the same machine.
In a distributed system, one data usually has multiple replicas so that the system can still run normally even if a few copies fail. It requires certain technical means to ensure consistency between replicas.
Basic principle: Raft is designed to ensure consistency between replicas. Raft uses election between replicas, and the (candidate) replica that wins more than half of the votes will become the Leader, providing external services on behalf of all replicas. The rest Followers will play backups. When the Leader fails (due to communication failure, operation and maintenance commands, etc.), the rest Followers will conduct a new round of elections and vote for a new Leader. The Leader and Followers will detect each other's survival through heartbeats and write them to the hard disk in Raft-wal mode. Replicas that do not respond to more than multiple heartbeats will be considered faulty.
Note
Raft-wal needs to be written into the hard disk periodically. If hard disk bottlenecks to write, Raft will fail to send a heartbeat and conduct a new round of elections. If the hard disk IO is severely blocked, there will be no Leader for a long time.
Read and write: For every writing request of the clients, the Leader will initiate a Raft-wal and synchronize it with the Followers. Only after over half replicas have received the Raft-wal will it return to the clients successfully. For every reading request of the clients, it will get to the Leader directly, while Followers will not be involved.
Failure: Scenario 1: Take a (space) cluster of a single replica as an example. If the system has only one replica, the Leader will be itself. If failure happens, the system will be completely unavailable. Scenario 2: Take a (space) cluster of three replicas as an example. If the system has three replicas, one of them will be the Leader and the rest will be the Followers. If the Leader fails, the rest two can still vote for a new Leader (and a Follower), and the system is still available. But if any of the two Followers fails again, the system will be completely unavailable due to inadequate voters.
Note
Raft and HDFS have different modes of duplication. Raft is based on a quorum vote, so the number of replicas cannot be even.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#multi_group_raft","title":"Multi Group Raft","text":"The Storage Service supports a distributed cluster architecture, so NebulaGraph implements Multi Group Raft according to Raft protocol. Each Raft group stores all the replicas of each partition. One replica is the leader, while others are followers. In this way, NebulaGraph achieves strong consistency and high availability. The functions of Raft are as follows.
NebulaGraph uses Multi Group Raft to improve performance when there are many partitions because Raft-wal cannot be NULL. When there are too many partitions, costs will increase, such as storing information in Raft group, WAL files, or batch operation in low load.
There are two key points to implement the Multi Raft Group:
To share transport layer
Each Raft Group sends messages to its corresponding peers. So if the transport layer cannot be shared, the connection costs will be very high.
To share thread pool
Raft Groups share the same thread pool to prevent starting too many threads and a high context switch cost.
For each partition, it is necessary to do a batch to improve throughput when writing the WAL serially. As NebulaGraph uses WAL to implement some special functions, batches need to be grouped, which is a feature of NebulaGraph.
For example, lock-free CAS operations will execute after all the previous WALs are committed. So for a batch, if there are several WALs in CAS type, we need to divide this batch into several smaller groups and make sure they are committed serially.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#transfer_leadership","title":"Transfer Leadership","text":"Transfer leadership is extremely important for balance. When moving a partition from one machine to another, NebulaGraph first checks if the source is a leader. If so, it should be moved to another peer. After data migration is completed, it is important to balance leader distribution again.
When a transfer leadership command is committed, the leader will abandon its leadership and the followers will start a leader election.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#peer_changes","title":"Peer changes","text":"To avoid split-brain, when members in a Raft Group change, an intermediate state is required. In such a state, the quorum of the old group and new group always have an overlap. Thus it prevents the old or new group from making decisions unilaterally. To make it even simpler, in his doctoral thesis Diego Ongaro suggests adding or removing a peer once to ensure the overlap between the quorum of the new group and the old group. NebulaGraph also uses this approach, except that the way to add or remove a member is different. For details, please refer to addPeer/removePeer in the Raft Part class.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#differences_with_hdfs","title":"Differences with HDFS","text":"The Storage Service is a Raft-based distributed architecture, which has certain differences with that of HDFS. For example:
In a word, the Storage Service is more lightweight with some functions simplified and its architecture is simpler than HDFS, which can effectively improve the read and write performance of a smaller block of data.
"},{"location":"14.client/1.nebula-client/","title":"Clients overview","text":"NebulaGraph supports multiple types of clients for users to connect to and manage the NebulaGraph database.
Note
Only the following classes are thread-safe:
NebulaGraph CPP is a C++ client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/3.nebula-cpp-client/#prerequisites","title":"Prerequisites","text":"You have installed C++ and GCC 4.8 or later versions.
"},{"location":"14.client/3.nebula-cpp-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/3.nebula-cpp-client/#install_nebulagraph_cpp","title":"Install NebulaGraph CPP","text":"This document describes how to install NebulaGraph CPP with the source code.
"},{"location":"14.client/3.nebula-cpp-client/#prerequisites_1","title":"Prerequisites","text":"Clone the NebulaGraph CPP source code to the host.
(Recommended) To install a specific version of NebulaGraph CPP, use the Git option --branch
to specify the branch. For example, to install v3.4.0, run the following command:
$ git clone --branch release-3.4 https://github.com/vesoft-inc/nebula-cpp.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-cpp.git\n
Change the working directory to nebula-cpp
.
$ cd nebula-cpp\n
Create a directory named build
and change the working directory to it.
$ mkdir build && cd build\n
Generate the makefile
file with CMake.
Note
The default installation path is /usr/local/nebula
. To modify it, add the -DCMAKE_INSTALL_PREFIX=<installation_path>
option while running the following command.
$ cmake -DCMAKE_BUILD_TYPE=Release ..\n
Note
If G++ does not support C++ 11, add the option -DDISABLE_CXX11_ABI=ON
.
Compile NebulaGraph CPP.
To speed up the compiling, use the -j
option to set a concurrent number N
. It should be \\(\\min(\\text{CPU core number},\\frac{\\text{the memory size(GB)}}{2})\\).
$ make -j{N}\n
Install NebulaGraph CPP.
$ sudo make install\n
Update the dynamic link library.
$ sudo ldconfig\n
Compile the CPP file to an executable file, then you can use it. The following steps take using SessionExample.cpp
for example.
Use the example code to create the SessionExample.cpp
file.
Run the following command to compile the file.
$ LIBRARY_PATH=<library_folder_path>:$LIBRARY_PATH g++ -std=c++11 SessionExample.cpp -I<include_folder_path> -lnebula_graph_client -o session_example\n
library_folder_path
: The storage path of the NebulaGraph dynamic libraries. The default path is /usr/local/nebula/lib64
.include_folder_path
: The storage of the NebulaGraph header files. The default path is /usr/local/nebula/include
.For example:
$ LIBRARY_PATH=/usr/local/nebula/lib64:$LIBRARY_PATH g++ -std=c++11 SessionExample.cpp -I/usr/local/nebula/include -lnebula_graph_client -o session_example\n
"},{"location":"14.client/3.nebula-cpp-client/#api_reference","title":"API reference","text":"Click here to check the classes and functions provided by the CPP Client.
"},{"location":"14.client/3.nebula-cpp-client/#core_of_the_example_code","title":"Core of the example code","text":"Nebula CPP clients provide both Session Pool and Connection Pool methods to connect to NebulaGraph. Using the Connection Pool method requires users to manage session instances by themselves.
Session Pool
For more details about all the code, see SessionPoolExample.
Connection Pool
For more details about all the code, see SessionExample.
NebulaGraph Java is a Java client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/4.nebula-java-client/#prerequisites","title":"Prerequisites","text":"You have installed Java 8.0 or later versions.
"},{"location":"14.client/4.nebula-java-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/4.nebula-java-client/#download_nebulagraph_java","title":"Download NebulaGraph Java","text":"(Recommended) To install a specific version of NebulaGraph Java, use the Git option --branch
to specify the branch. For example, to install v3.6.1, run the following command:
$ git clone --branch release-3.6 https://github.com/vesoft-inc/nebula-java.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-java.git\n
Note
We recommend that each thread use one session. If multiple threads use the same session, the performance will be reduced.
When importing a Maven project with tools such as IDEA, set the following dependency in pom.xml
.
Note
3.0.0-SNAPSHOT
indicates the daily development version that may have unknown issues. We recommend that you replace 3.0.0-SNAPSHOT
with a released version number to use a table version.
<dependency>\n <groupId>com.vesoft</groupId>\n <artifactId>client</artifactId>\n <version>3.0.0-SNAPSHOT</version>\n</dependency>\n
If you cannot download the dependency for the daily development version, set the following content in pom.xml
. Released versions have no such issue.
<repositories> \n <repository> \n <id>snapshots</id> \n <url>https://oss.sonatype.org/content/repositories/snapshots/</url> \n </repository> \n</repositories>\n
If there is no Maven to manage the project, manually download the JAR file to install NebulaGraph Java.
"},{"location":"14.client/4.nebula-java-client/#api_reference","title":"API reference","text":"Click here to check the classes and functions provided by the Java Client.
"},{"location":"14.client/4.nebula-java-client/#core_of_the_example_code","title":"Core of the example code","text":"The NebulaGraph Java client provides both Connection Pool and Session Pool modes, using Connection Pool requires the user to manage session instances.
Session Pool
For all the code, see GraphSessionPoolExample.
Connection Pool
For all the code, see GraphClientExample.
NebulaGraph Python is a Python client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/5.nebula-python-client/#prerequisites","title":"Prerequisites","text":"You have installed Python 3.6 or later versions.
"},{"location":"14.client/5.nebula-python-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/5.nebula-python-client/#install_nebulagraph_python","title":"Install NebulaGraph Python","text":""},{"location":"14.client/5.nebula-python-client/#install_nebulagraph_python_with_pip","title":"Install NebulaGraph Python with pip","text":"$ pip install nebula3-python==<version>\n
"},{"location":"14.client/5.nebula-python-client/#install_nebulagraph_python_from_the_source_code","title":"Install NebulaGraph Python from the source code","text":"Clone the NebulaGraph Python source code to the host.
(Recommended) To install a specific version of NebulaGraph Python, use the Git option --branch
to specify the branch. For example, to install v3.4.0, run the following command:
$ git clone --branch release-3.4 https://github.com/vesoft-inc/nebula-python.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-python.git\n
Change the working directory to nebula-python.
$ cd nebula-python\n
Run the following command to install NebulaGraph Python.
$ pip install .\n
Click here to check the classes and functions provided by the Python Client.
"},{"location":"14.client/5.nebula-python-client/#core_of_the_example_code","title":"Core of the example code","text":"NebulaGraph Python clients provides Connection Pool and Session Pool methods to connect to NebulaGraph. Using the Connection Pool method requires users to manage sessions by themselves.
Session Pool
For details about all the code, see SessionPoolExample.py.
For limitations of using the Session Pool method, see Example of using session pool.
Connection Pool
For details about all the code, see Example.
NebulaGraph Go is a Golang client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/6.nebula-go-client/#prerequisites","title":"Prerequisites","text":"You have installed Golang 1.13 or later versions.
"},{"location":"14.client/6.nebula-go-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/6.nebula-go-client/#download_nebulagraph_go","title":"Download NebulaGraph Go","text":"(Recommended) To install a specific version of NebulaGraph Go, use the Git option --branch
to specify the branch. For example, to install v3.7.0, run the following command:
$ git clone --branch release-3.7 https://github.com/vesoft-inc/nebula-go.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-go.git\n
Run the following command to install or update NebulaGraph Go:
$ go get -u -v github.com/vesoft-inc/nebula-go/v3@v3.7.0\n
"},{"location":"14.client/6.nebula-go-client/#api_reference","title":"API reference","text":"Click here to check the functions and types provided by the GO Client.
"},{"location":"14.client/6.nebula-go-client/#core_of_the_example_code","title":"Core of the example code","text":"The NebulaGraph GO client provides both Connection Pool and Session Pool, using Connection Pool requires the user to manage the session instances.
Session Pool
For details about all the code, see session_pool_example.go.
For limitations of using Session Pool, see Usage example.
Connection Pool
For all the code, see graph_client_basic_example and graph_client_goroutines_example.
You can use the following clients developed by community users to connect to and manage NebulaGraph:
You are welcome to contribute any code or files to the project. But we suggest that you first raise an issue on GitHub or the forum to start a discussion with the community. Check through the topics on GitHub.
"},{"location":"15.contribution/how-to-contribute/#sign_the_contributor_license_agreement_cla","title":"Sign the Contributor License Agreement CLA","text":"If you have any questions, submit an issue.
"},{"location":"15.contribution/how-to-contribute/#modify_a_single_document","title":"Modify a single document","text":"This manual is written in the Markdown language. Click the pencil
icon on the right of the document title to commit the modification.
This method applies to modifying a single document only.
"},{"location":"15.contribution/how-to-contribute/#batch_modify_or_add_files","title":"Batch modify or add files","text":"This method applies to contributing code, modifying multiple documents in batches, or adding new documents.
"},{"location":"15.contribution/how-to-contribute/#step_1_fork_in_the_githubcom","title":"Step 1: Fork in the github.com","text":"The NebulaGraph project has many repositories. Take the nebul repository for example:
Visit https://github.com/vesoft-inc/nebula.
Click the Fork
button to establish an online fork.
Define a local working directory.
# Define the working directory.\nworking_dir=$HOME/Workspace\n
Set user
to match the Github profile name.
user={the Github profile name}\n
Create your clone.
mkdir -p $working_dir\ncd $working_dir\ngit clone https://github.com/$user/nebula.git\n# or: git clone git@github.com:$user/nebula.git\n\ncd $working_dir/nebula\ngit remote add upstream https://github.com/vesoft-inc/nebula.git\n# or: git remote add upstream git@github.com:vesoft-inc/nebula.git\n\n# Never push to upstream master since you do not have write access.\ngit remote set-url --push upstream no_push\n\n# Confirm that the remote branch is valid.\n# The correct format is:\n# origin git@github.com:$(user)/nebula.git (fetch)\n# origin git@github.com:$(user)/nebula.git (push)\n# upstream https://github.com/vesoft-inc/nebula (fetch)\n# upstream no_push (push)\ngit remote -v\n
(Optional) Define a pre-commit hook.
Please link the NebulaGraph pre-commit hook into the .git
directory.
This hook checks the commits for formatting, building, doc generation, etc.
cd $working_dir/nebula/.git/hooks\nln -s $working_dir/nebula/.linters/cpp/hooks/pre-commit.sh .\n
Sometimes, the pre-commit hook cannot be executed, and you have to make it executable manually.
cd $working_dir/nebula/.git/hooks\nchmod +x pre-commit\n
Get your local master up to date.
cd $working_dir/nebula\ngit fetch upstream\ngit checkout master\ngit rebase upstream/master\n
Checkout a new branch from master.
git checkout -b myfeature\n
Note
A PR often consists of several commits, which might be squashed when merged into upstream. We strongly suggest that you open a separate topic branch to make your changes on. After the PR is merged, the topic branch can simply be abandoned, so you can easily synchronize your master branch with upstream using a rebase as shown above. Otherwise, if you commit your changes directly to master, you will need to use a hard reset on the master branch. For example:
git fetch upstream\ngit checkout master\ngit reset --hard upstream/master\ngit push --force origin master\n
Code style
NebulaGraph adopts cpplint
to make sure that the project conforms to Google's coding style guide. The checker runs before the code is committed.
Unit tests requirements
Please add unit tests for the new features or bug fixes.
Build your code with unit tests enabled
For more information, see Install NebulaGraph by compiling the source code.
Note
Make sure you have enabled the building of unit tests by setting -DENABLE_TESTING=ON
.
Run tests
In the root directory of nebula
, run the following command:
cd nebula/build\nctest -j$(nproc)\n
# While on your myfeature branch.\ngit fetch upstream\ngit rebase upstream/master\n
Users need to bring the head branch up to date after other contributors merge PRs into the base branch.
"},{"location":"15.contribution/how-to-contribute/#step_6_commit","title":"Step 6: Commit","text":"Commit your changes.
git commit -a\n
Users can use the git commit --amend command to re-edit the previous commit.
When ready to review or just to establish an offsite backup, push your branch to your fork on github.com
:
git push origin myfeature\n
"},{"location":"15.contribution/how-to-contribute/#step_8_create_a_pull_request","title":"Step 8: Create a Pull Request","text":"Visit your fork at https://github.com/$user/nebula
(replace $user
here).
Click the Compare & pull request
button next to your myfeature
branch.
Once your pull request has been created, it will be assigned to at least two reviewers. Those reviewers will do a thorough code review to make sure that the changes meet the repository's contributing guidelines and other quality standards.
"},{"location":"15.contribution/how-to-contribute/#add_test_cases","title":"Add test cases","text":"For detailed methods, see How to add test cases.
"},{"location":"15.contribution/how-to-contribute/#donation","title":"Donation","text":""},{"location":"15.contribution/how-to-contribute/#step_1_confirm_the_project_donation","title":"Step 1: Confirm the project donation","text":"Contact the official NebulaGraph staff via email, WeChat, Slack, etc. to confirm the donation project. The project will be donated to the NebulaGraph Contrib organization.
Email address: info@vesoft.com
WeChat: NebulaGraphbot
Slack: Join Slack
"},{"location":"15.contribution/how-to-contribute/#step_2_get_the_information_of_the_project_recipient","title":"Step 2: Get the information of the project recipient","text":"The NebulaGraph official staff will give the recipient ID of the NebulaGraph Contrib project.
"},{"location":"15.contribution/how-to-contribute/#step_3_donate_a_project","title":"Step 3: Donate a project","text":"The user transfers the project to the recipient of this donation, and the recipient transfers the project to the NebulaGraph Contrib organization. After the donation, the user will continue to lead the development of community projects as a Maintainer.
For operations of transferring a repository on GitHub, see Transferring a repository owned by your user account.
"},{"location":"2.quick-start/1.quick-start-workflow/","title":"Quickly deploy NebulaGraph using Docker","text":"You can quickly get started with NebulaGraph by deploying NebulaGraph with Docker Desktop or Docker Compose.
Using Docker DesktopUsing Docker ComposeNebulaGraph is available as a Docker Extension that you can easily install and run on your Docker Desktop. You can quickly deploy NebulaGraph using Docker Desktop with just one click.
Install Docker Desktop.
Caution
We do not recommend you deploy NebulaGraph on Docker Desktop for Windows due to its subpar performance. For details, see #12401. If you must use Docker Desktop for Windows, install WSL 2 first.
In the left sidebar of Docker Desktop, click Extensions or Add Extensions.
On the Extensions Marketplace, search for NebulaGraph and click Install.
Click Update to update NebulaGraph to the latest version when a new version is available.
Click Open to navigate to the NebulaGraph extension page.
At the top of the page, click Studio in Browser to use NebulaGraph.
For more information about how to use NebulaGraph with Docker Desktop, see the following video:
Using Docker Compose can quickly deploy NebulaGraph services based on the prepared configuration file. It is only recommended to use this method when testing the functions of NebulaGraph.
"},{"location":"2.quick-start/1.quick-start-workflow/#prerequisites","title":"Prerequisites","text":"You have installed the following applications on your host.
Application Recommended version Official installation reference Docker Latest Install Docker Engine Docker Compose Latest Install Docker Compose Git Latest Download Gitnebula-docker-compose/data
directory.Clone the 3.6.0
branch of the nebula-docker-compose
repository to your host with Git.
Danger
The master
branch contains the untested code for the latest NebulaGraph development release. DO NOT use this release in a production environment.
$ git clone -b release-3.6 https://github.com/vesoft-inc/nebula-docker-compose.git\n
Note
The x.y
version of Docker Compose aligns to the x.y
version of NebulaGraph. For the NebulaGraph z
version, Docker Compose does not publish the corresponding z
version, but pulls the z
version of the NebulaGraph image.
Go to the nebula-docker-compose
directory.
$ cd nebula-docker-compose/\n
Run the following command to start all the NebulaGraph services.
Note
[nebula-docker-compose]$ docker-compose up -d\nCreating nebula-docker-compose_metad0_1 ... done\nCreating nebula-docker-compose_metad2_1 ... done\nCreating nebula-docker-compose_metad1_1 ... done\nCreating nebula-docker-compose_graphd2_1 ... done\nCreating nebula-docker-compose_graphd_1 ... done\nCreating nebula-docker-compose_graphd1_1 ... done\nCreating nebula-docker-compose_storaged0_1 ... done\nCreating nebula-docker-compose_storaged2_1 ... done\nCreating nebula-docker-compose_storaged1_1 ... done\n
Compatibility
Starting from NebulaGraph version 3.1.0, nebula-docker-compose automatically starts a NebulaGraph Console docker container and adds the storage host to the cluster (i.e. ADD HOSTS
command).
Note
For more information of the preceding services, see NebulaGraph architecture.
There are two ways to connect to NebulaGraph:
9669
in the container's configuration file, you can connect directly through the default port. For details, see Connect to NebulaGraph.Run the following command to view the name of NebulaGraph Console docker container.
$ docker-compose ps\n Name Command State Ports\n--------------------------------------------------------------------------------------------\nnebula-docker-compose_console_1 sh -c sleep 3 && Up\n nebula-co ...\n......\n
Run the following command to enter the NebulaGraph Console docker container.
docker exec -it nebula-docker-compose_console_1 /bin/sh\n/ #\n
Connect to NebulaGraph with NebulaGraph Console.
/ # ./usr/local/bin/nebula-console -u <user_name> -p <password> --address=graphd --port=9669\n
Note
By default, the authentication is off, you can only log in with an existing username (the default is root
) and any password. To turn it on, see Enable authentication.
Run the following commands to view the cluster state.
nebula> SHOW HOSTS;\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n
Run exit
twice to switch back to your terminal (shell).
Run docker-compose ps
to list all the services of NebulaGraph and their status and ports.
Note
NebulaGraph provides services to the clients through port 9669
by default. To use other ports, modify the docker-compose.yaml
file in the nebula-docker-compose
directory and restart the NebulaGraph services.
$ docker-compose ps\nnebula-docker-compose_console_1 sh -c sleep 3 && Up\n nebula-co ...\nnebula-docker-compose_graphd1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49174->19669/tcp,:::49174->19669/tcp, 0.0.0.0:49171->19670/tcp,:::49171->19670/tcp, 0.0.0.0:49177->9669/tcp,:::49177->9669/tcp\nnebula-docker-compose_graphd2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49175->19669/tcp,:::49175->19669/tcp, 0.0.0.0:49172->19670/tcp,:::49172->19670/tcp, 0.0.0.0:49178->9669/tcp,:::49178->9669/tcp\nnebula-docker-compose_graphd_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49180->19669/tcp,:::49180->19669/tcp, 0.0.0.0:49179->19670/tcp,:::49179->19670/tcp, 0.0.0.0:9669->9669/tcp,:::9669->9669/tcp\nnebula-docker-compose_metad0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49157->19559/tcp,:::49157->19559/tcp, 0.0.0.0:49154->19560/tcp,:::49154->19560/tcp, 0.0.0.0:49160->9559/tcp,:::49160->9559/tcp, 9560/tcp\nnebula-docker-compose_metad1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49156->19559/tcp,:::49156->19559/tcp, 0.0.0.0:49153->19560/tcp,:::49153->19560/tcp, 0.0.0.0:49159->9559/tcp,:::49159->9559/tcp, 9560/tcp\nnebula-docker-compose_metad2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49158->19559/tcp,:::49158->19559/tcp, 0.0.0.0:49155->19560/tcp,:::49155->19560/tcp, 0.0.0.0:49161->9559/tcp,:::49161->9559/tcp, 9560/tcp\nnebula-docker-compose_storaged0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49166->19779/tcp,:::49166->19779/tcp, 0.0.0.0:49163->19780/tcp,:::49163->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49169->9779/tcp,:::49169->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49165->19779/tcp,:::49165->19779/tcp, 0.0.0.0:49162->19780/tcp,:::49162->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49168->9779/tcp,:::49168->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49167->19779/tcp,:::49167->19779/tcp, 0.0.0.0:49164->19780/tcp,:::49164->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49170->9779/tcp,:::49170->9779/tcp, 9780/tcp\n
If the service is abnormal, you can first confirm the abnormal container name (such as nebula-docker-compose_graphd2_1
).
Then you can execute docker ps
to view the corresponding CONTAINER ID
(such as 2a6c56c405f5
).
[nebula-docker-compose]$ docker ps\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES\n2a6c56c405f5 vesoft/nebula-graphd:nightly \"/usr/local/nebula/b\u2026\" 36 minutes ago Up 36 minutes (healthy) 0.0.0.0:49230->9669/tcp, 0.0.0.0:49229->19669/tcp, 0.0.0.0:49228->19670/tcp nebula-docker-compose_graphd2_1\n7042e0a8e83d vesoft/nebula-storaged:nightly \"./bin/nebula-storag\u2026\" 36 minutes ago Up 36 minutes (healthy) 9777-9778/tcp, 9780/tcp, 0.0.0.0:49227->9779/tcp, 0.0.0.0:49226->19779/tcp, 0.0.0.0:49225->19780/tcp nebula-docker-compose_storaged2_1\n18e3ea63ad65 vesoft/nebula-storaged:nightly \"./bin/nebula-storag\u2026\" 36 minutes ago Up 36 minutes (healthy) 9777-9778/tcp, 9780/tcp, 0.0.0.0:49219->9779/tcp, 0.0.0.0:49218->19779/tcp, 0.0.0.0:49217->19780/tcp nebula-docker-compose_storaged0_1\n4dcabfe8677a vesoft/nebula-graphd:nightly \"/usr/local/nebula/b\u2026\" 36 minutes ago Up 36 minutes (healthy) 0.0.0.0:49224->9669/tcp, 0.0.0.0:49223->19669/tcp, 0.0.0.0:49222->19670/tcp nebula-docker-compose_graphd1_1\na74054c6ae25 vesoft/nebula-graphd:nightly \"/usr/local/nebula/b\u2026\" 36 minutes ago Up 36 minutes (healthy) 0.0.0.0:9669->9669/tcp, 0.0.0.0:49221->19669/tcp, 0.0.0.0:49220->19670/tcp nebula-docker-compose_graphd_1\n880025a3858c vesoft/nebula-storaged:nightly \"./bin/nebula-storag\u2026\" 36 minutes ago Up 36 minutes (healthy) 9777-9778/tcp, 9780/tcp, 0.0.0.0:49216->9779/tcp, 0.0.0.0:49215->19779/tcp, 0.0.0.0:49214->19780/tcp nebula-docker-compose_storaged1_1\n45736a32a23a vesoft/nebula-metad:nightly \"./bin/nebula-metad \u2026\" 36 minutes ago Up 36 minutes (healthy) 9560/tcp, 0.0.0.0:49213->9559/tcp, 0.0.0.0:49212->19559/tcp, 0.0.0.0:49211->19560/tcp nebula-docker-compose_metad0_1\n3b2c90eb073e vesoft/nebula-metad:nightly \"./bin/nebula-metad \u2026\" 36 minutes ago Up 36 minutes (healthy) 9560/tcp, 0.0.0.0:49207->9559/tcp, 0.0.0.0:49206->19559/tcp, 0.0.0.0:49205->19560/tcp nebula-docker-compose_metad2_1\n7bb31b7a5b3f vesoft/nebula-metad:nightly \"./bin/nebula-metad \u2026\" 36 minutes ago Up 36 minutes (healthy) 9560/tcp, 0.0.0.0:49210->9559/tcp, 0.0.0.0:49209->19559/tcp, 0.0.0.0:49208->19560/tcp nebula-docker-compose_metad1_1\n
Use the CONTAINER ID
to log in the container and troubleshoot.
nebula-docker-compose]$ docker exec -it 2a6c56c405f5 bash\n[root@2a6c56c405f5 nebula]#\n
"},{"location":"2.quick-start/1.quick-start-workflow/#check_the_service_data_and_logs","title":"Check the service data and logs","text":"All the data and logs of NebulaGraph are stored persistently in the nebula-docker-compose/data
and nebula-docker-compose/logs
directories.
The structure of the directories is as follows:
nebula-docker-compose/\n ├── docker-compose.yaml\n ├── data\n │   ├── meta0\n │   ├── meta1\n │   ├── meta2\n │   ├── storage0\n │   ├── storage1\n │   └── storage2\n └── logs\n     ├── graph\n     ├── graph1\n     ├── graph2\n     ├── meta0\n     ├── meta1\n     ├── meta2\n     ├── storage0\n     ├── storage1\n     └── storage2\n
"},{"location":"2.quick-start/1.quick-start-workflow/#stop_the_nebulagraph_services","title":"Stop the NebulaGraph services","text":"You can run the following command to stop the NebulaGraph services:
$ docker-compose down\n
The following information indicates you have successfully stopped the NebulaGraph services:
Stopping nebula-docker-compose_console_1 ... done\nStopping nebula-docker-compose_graphd1_1 ... done\nStopping nebula-docker-compose_graphd_1 ... done\nStopping nebula-docker-compose_graphd2_1 ... done\nStopping nebula-docker-compose_storaged1_1 ... done\nStopping nebula-docker-compose_storaged0_1 ... done\nStopping nebula-docker-compose_storaged2_1 ... done\nStopping nebula-docker-compose_metad2_1 ... done\nStopping nebula-docker-compose_metad0_1 ... done\nStopping nebula-docker-compose_metad1_1 ... done\nRemoving nebula-docker-compose_console_1 ... done\nRemoving nebula-docker-compose_graphd1_1 ... done\nRemoving nebula-docker-compose_graphd_1 ... done\nRemoving nebula-docker-compose_graphd2_1 ... done\nRemoving nebula-docker-compose_storaged1_1 ... done\nRemoving nebula-docker-compose_storaged0_1 ... done\nRemoving nebula-docker-compose_storaged2_1 ... done\nRemoving nebula-docker-compose_metad2_1 ... done\nRemoving nebula-docker-compose_metad0_1 ... done\nRemoving nebula-docker-compose_metad1_1 ... done\nRemoving network nebula-docker-compose_nebula-net\n
Danger
The parameter -v
in the command docker-compose down -v
will delete all your local NebulaGraph storage data. Try this command if you are using the nightly release and having some compatibility issues.
The configuration file of NebulaGraph deployed by Docker Compose is nebula-docker-compose/docker-compose.yaml
. To make the new configuration take effect, modify the configuration in this file and restart the service.
For more instructions, see Configurations.
"},{"location":"2.quick-start/1.quick-start-workflow/#faq","title":"FAQ","text":""},{"location":"2.quick-start/1.quick-start-workflow/#how_to_fix_the_docker_mapping_to_external_ports","title":"How to fix the docker mapping to external ports?","text":"To set the ports
of corresponding services as fixed mapping, modify the docker-compose.yaml
in the nebula-docker-compose
directory. For example:
graphd:\n image: vesoft/nebula-graphd:release-3.6\n ...\n ports:\n - 9669:9669\n - 19669\n - 19670\n
9669:9669
indicates the internal port 9669 is uniformly mapped to external ports, while 19669
indicates the internal port 19669 is randomly mapped to external ports.
In the nebula-docker-compose/docker-compose.yaml
file, change all the image
values to the required image version.
In the nebula-docker-compose
directory, run docker-compose pull
to update the images of the Graph Service, Storage Service, Meta Service, and NebulaGraph Console.
Run docker-compose up -d
to start the NebulaGraph services again.
After connecting to NebulaGraph with NebulaGraph Console, run SHOW HOSTS GRAPH
, SHOW HOSTS STORAGE
, or SHOW HOSTS META
to check the version of the responding service respectively.
ERROR: toomanyrequests
when docker-compose pull
","text":"You may meet the following error.
ERROR: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
.
You have met the rate limit of Docker Hub. Learn more on Understanding Docker Hub Rate Limiting.
"},{"location":"2.quick-start/1.quick-start-workflow/#how_to_update_the_nebulagraph_console_client","title":"How to update the NebulaGraph Console client","text":"The command docker-compose pull
updates both the NebulaGraph services and the NebulaGraph Console.
RPM and DEB are common package formats on Linux systems. This topic shows how to quickly install NebulaGraph with the RPM or DEB package.
Note
The console is not complied or packaged with NebulaGraph server binaries. You can install nebula-console by yourself.
"},{"location":"2.quick-start/2.install-nebula-graph/#prerequisites","title":"Prerequisites","text":"wget
is installed.Note
NebulaGraph is currently only supported for installation on Linux systems, and only CentOS 7.x, CentOS 8.x, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 operating systems are supported.
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.deb\n
For example, download the release package master
for Centos 7.5
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm.sha256sum.txt\n
Download the release package master
for Ubuntu 1804
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb.sha256sum.txt\n
Download the nightly version.
Danger
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu2004.amd64.deb\n
For example, download the Centos 7.5
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm.sha256sum.txt\n
For example, download the Ubuntu 1804
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb.sha256sum.txt\n
Use the following syntax to install with an RPM package.
$ sudo rpm -ivh --prefix=<installation_path> <package_name>\n
The option --prefix
indicates the installation path. The default path is /usr/local/nebula/
.
For example, to install an RPM package in the default path for the master version, run the following command.
sudo rpm -ivh nebula-graph-master.el7.x86_64.rpm\n
Use the following syntax to install with a DEB package.
$ sudo dpkg -i <package_name>\n
Note
Customizing the installation path is not supported when installing NebulaGraph with a DEB package. The default installation path is /usr/local/nebula/
.
For example, to install a DEB package for the master version, run the following command.
sudo dpkg -i nebula-graph-master.ubuntu1804.amd64.deb\n
Note
The default installation path is /usr/local/nebula/
.
When connecting to NebulaGraph for the first time, you have to add the Storage hosts, and confirm that all the hosts are online.
Compatibility
ADD HOSTS
before reading or writing data into the Storage Service.ADD HOSTS
is not needed. You have connected to NebulaGraph.
"},{"location":"2.quick-start/3.1add-storage-hosts/#steps","title":"Steps","text":"Add the Storage hosts.
Run the following command to add hosts:
ADD HOSTS <ip>:<port> [,<ip>:<port> ...];\n
Example\uff1a
nebula> ADD HOSTS 192.168.10.100:9779, 192.168.10.101:9779, 192.168.10.102:9779;\n
Caution
Make sure that the IP you added is the same as the IP configured for local_ip
in the nebula-storaged.conf
file. Otherwise, the Storage service will fail to start. For information about configurations, see Configurations.
Check the status of the hosts to make sure that they are all online.
nebula> SHOW HOSTS;\n+------------------+------+----------+--------------+---------------------- +------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+------------------+------+----------+--------------+---------------------- +------------------------+---------+\n| \"192.168.10.100\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.101\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\"|\n| \"192.168.10.102\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\"|\n+------------------+------+----------+--------------+---------------------- +------------------------+---------+\n
The Status
column of the result above shows that all Storage hosts are online.
This topic provides basic instruction on how to use the native CLI client NebulaGraph Console to connect to NebulaGraph.
Caution
When connecting to NebulaGraph for the first time, you must register the Storage Service before querying data.
NebulaGraph supports multiple types of clients, including a CLI client, a GUI client, and clients developed in popular programming languages. For more information, see the client list.
"},{"location":"2.quick-start/3.connect-to-nebula-graph/#prerequisites","title":"Prerequisites","text":"The NebulaGraph Console version is compatible with the NebulaGraph version.
Note
NebulaGraph Console and NebulaGraph of the same version number are the most compatible. There may be compatibility issues when connecting to NebulaGraph with a different version of NebulaGraph Console. The error message incompatible version between client and server
is displayed when there is such an issue.
On the NebulaGraph Console releases page, select a NebulaGraph Console version and click Assets.
Note
It is recommended to select the latest version.
In the Assets area, find the correct binary file for the machine where you want to run NebulaGraph Console and download the file to the machine.
(Optional) Rename the binary file to nebula-console
for convenience.
Note
For Windows, rename the file to nebula-console.exe
.
On the machine to run NebulaGraph Console, grant the execute permission of the nebula-console binary file to the user.
Note
For Windows, skip this step.
$ chmod 111 nebula-console\n
In the command line interface, change the working directory to the one where the nebula-console binary file is stored.
Run the following command to connect to NebulaGraph.
$ ./nebula-console -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
> nebula-console.exe -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP (or hostname) of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is millisecond. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. For information on more parameters, see the project repository.
This topic will describe the basic CRUD operations in NebulaGraph.
For more information, see nGQL guide.
"},{"location":"2.quick-start/4.nebula-graph-crud/#graph_space_and_nebulagraph_schema","title":"Graph space and NebulaGraph schema","text":"A NebulaGraph instance consists of one or more graph spaces. Graph spaces are physically isolated from each other. You can use different graph spaces in the same instance to store different datasets.
To insert data into a graph space, define a schema for the graph database. NebulaGraph schema is based on the following components.
Schema component Description Vertex Represents an entity in the real world. A vertex can have zero to multiple tags. Tag The type of the same group of vertices. It defines a set of properties that describes the types of vertices. Edge Represents a directed relationship between two vertices. Edge type The type of an edge. It defines a group of properties that describes the types of edges.For more information, see Data modeling.
In this topic, we will use the following dataset to demonstrate basic CRUD operations.
"},{"location":"2.quick-start/4.nebula-graph-crud/#async_implementation_of_create_and_alter","title":"Async implementation ofCREATE
and ALTER
","text":"Caution
In NebulaGraph, the following CREATE
or ALTER
commands are implemented in an async way and take effect in the next heartbeat cycle. Otherwise, an error will be returned. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
CREATE SPACE
CREATE TAG
CREATE EDGE
ALTER TAG
ALTER EDGE
CREATE TAG INDEX
CREATE EDGE INDEX
Note
The default heartbeat interval is 10 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
CREATE SPACE [IF NOT EXISTS] <graph_space_name> (\n[partition_num = <partition_number>,]\n[replica_factor = <replica_number>,]\nvid_type = {FIXED_STRING(<N>) | INT64}\n)\n[COMMENT = '<comment>'];\n
For more information on parameters, see CREATE SPACE.
nebula> SHOW SPACES;\n
USE <graph_space_name>;\n
Use the following statement to create a graph space named basketballplayer
.
nebula> CREATE SPACE basketballplayer(partition_num=15, replica_factor=1, vid_type=fixed_string(30));\n
Note
If the system returns the error [ERROR (-1005)]: Host not enough!
, check whether registered the Storage Service.
Check the partition distribution with SHOW HOSTS
to make sure that the partitions are distributed in a balanced way.
nebula> SHOW HOSTS;\n+-------------+-----------+-----------+--------------+----------------------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+-----------+-----------+--------------+----------------------------------+------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 5 | \"basketballplayer:5\" | \"basketballplayer:5\" | \"master\"|\n| \"storaged1\" | 9779 | \"ONLINE\" | 5 | \"basketballplayer:5\" | \"basketballplayer:5\" | \"master\"|\n| \"storaged2\" | 9779 | \"ONLINE\" | 5 | \"basketballplayer:5\" | \"basketballplayer:5\" | \"master\"|\n+-------------+-----------+-----------+-----------+--------------+----------------------------------+------------------------+---------+\n
If the Leader distribution is uneven, use BALANCE LEADER
to redistribute the partitions. For more information, see BALANCE.
Use the basketballplayer
graph space.
nebula[(none)]> USE basketballplayer;\n
You can use SHOW SPACES
to check the graph space you created.
nebula> SHOW SPACES;\n+--------------------+\n| Name |\n+--------------------+\n| \"basketballplayer\" |\n+--------------------+\n
CREATE {TAG | EDGE} [IF NOT EXISTS] {<tag_name> | <edge_type_name>}\n (\n <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']\n [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] \n )\n [TTL_DURATION = <ttl_duration>]\n [TTL_COL = <prop_name>]\n [COMMENT = '<comment>'];\n
For more information on parameters, see CREATE TAG and CREATE EDGE.
"},{"location":"2.quick-start/4.nebula-graph-crud/#examples_1","title":"Examples","text":"Create tags player
and team
, and edge types follow
and serve
. Descriptions are as follows.
nebula> CREATE TAG player(name string, age int);\n\nnebula> CREATE TAG team(name string);\n\nnebula> CREATE EDGE follow(degree int);\n\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
"},{"location":"2.quick-start/4.nebula-graph-crud/#insert_vertices_and_edges","title":"Insert vertices and edges","text":"You can use the INSERT
statement to insert vertices or edges based on existing tags or edge types.
INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...]\nVALUES <vid>: ([prop_value_list])\n\ntag_props:\n tag_name ([prop_name_list])\n\nprop_name_list:\n [prop_name [, prop_name] ...]\n\nprop_value_list:\n [prop_value [, prop_value] ...] \n
vid
is short for Vertex ID. A vid
must be a unique string value in a graph space. For details, see INSERT VERTEX.
Insert edges:
INSERT EDGE [IF NOT EXISTS] <edge_type> ( <prop_name_list> ) VALUES \n<src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> )\n[, <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ), ...];\n<prop_name_list> ::=\n[ <prop_name> [, <prop_name> ] ...]\n<prop_value_list> ::=\n[ <prop_value> [, <prop_value> ] ...]\n
For more information on parameters, see INSERT EDGE.
nebula> INSERT VERTEX player(name, age) VALUES \"player100\":(\"Tim Duncan\", 42);\n\nnebula> INSERT VERTEX player(name, age) VALUES \"player101\":(\"Tony Parker\", 36);\n\nnebula> INSERT VERTEX player(name, age) VALUES \"player102\":(\"LaMarcus Aldridge\", 33);\n\nnebula> INSERT VERTEX team(name) VALUES \"team203\":(\"Trail Blazers\"), \"team204\":(\"Spurs\");\n
nebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player100\":(95);\n\nnebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player102\":(90);\n\nnebula> INSERT EDGE follow(degree) VALUES \"player102\" -> \"player100\":(75);\n\nnebula> INSERT EDGE serve(start_year, end_year) VALUES \"player101\" -> \"team204\":(1999, 2018),\"player102\" -> \"team203\":(2006, 2015);\n
GO
traversal starts from one or more vertices, along one or more edges, and returns information in a form specified in the YIELD
clause.WHERE
clause to search for the data that meet the specific conditions.GO
GO [[<M> TO] <N> {STEP|STEPS} ] FROM <vertex_list>\nOVER <edge_type_list> [{REVERSELY | BIDIRECT}]\n[ WHERE <conditions> ]\nYIELD [DISTINCT] <return_list>\n[{ SAMPLE <sample_list> | <limit_by_list_clause> }]\n[| GROUP BY {<col_name> | expression> | <position>} YIELD <col_name>]\n[| ORDER BY <expression> [{ASC | DESC}]]\n[| LIMIT [<offset>,] <number_rows>];\n
FETCH
Fetch properties on tags:
FETCH PROP ON {<tag_name>[, tag_name ...] | *}\n<vid> [, vid ...]\nYIELD <return_list> [AS <alias>];\n
Fetch properties on edges:
FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]\nYIELD <output>;\n
LOOKUP
LOOKUP ON {<vertex_tag> | <edge_type>}\n[WHERE <expression> [AND <expression> ...]]\nYIELD <return_list> [AS <alias>];\n<return_list>\n <prop_name> [AS <col_alias>] [, <prop_name> [AS <prop_alias>] ...];\n
MATCH
MATCH <pattern> [<clause_1>] RETURN <output> [<clause_2>];\n
GO
statement","text":"player101
follows.nebula> GO FROM \"player101\" OVER follow YIELD id($$);\n+-------------+\n| id($$) |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n+-------------+\n
player101
follows whose age is equal to or greater than 35. Rename the corresponding columns in the results with Teammate
and Age
.nebula> GO FROM \"player101\" OVER follow WHERE properties($$).age >= 35 \\\n YIELD properties($$).name AS Teammate, properties($$).age AS Age;\n+-----------------+-----+\n| Teammate | Age |\n+-----------------+-----+\n| \"Tim Duncan\" | 42 |\n| \"Manu Ginobili\" | 41 |\n+-----------------+-----+\n
| Clause/Sign | Description | |-------------+---------------------------------------------------------------------| | YIELD
| Specifies what values or results you want to return from the query. | | $$
| Represents the target vertices. | | \\
| A line-breaker. |
Search for the players that the player with VID player101
follows. Then retrieve the teams of the players that the player with VID player100
follows. To combine the two queries, use a pipe or a temporary variable.
With a pipe:
nebula> GO FROM \"player101\" OVER follow YIELD dst(edge) AS id | \\\n GO FROM $-.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------------+---------------------+\n| Team | Player |\n+-----------------+---------------------+\n| \"Spurs\" | \"Tim Duncan\" |\n| \"Trail Blazers\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------------+---------------------+\n
Clause/Sign Description $^
Represents the source vertex of the edge. |
A pipe symbol can combine multiple queries. $-
Represents the outputs of the query before the pipe symbol. With a temporary variable:
Note
Once a composite statement is submitted to the server as a whole, the life cycle of the temporary variables in the statement ends.
nebula> $var = GO FROM \"player101\" OVER follow YIELD dst(edge) AS id; \\\n GO FROM $var.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------------+---------------------+\n| Team | Player |\n+-----------------+---------------------+\n| \"Spurs\" | \"Tim Duncan\" |\n| \"Trail Blazers\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------------+---------------------+\n
FETCH
statement","text":"Use FETCH
: Fetch the properties of the player with VID player100
.
nebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n+-------------------------------+\n| properties(VERTEX) |\n+-------------------------------+\n| {age: 42, name: \"Tim Duncan\"} |\n+-------------------------------+\n
Note
The examples of LOOKUP
and MATCH
statements are in indexes.
Users can use the UPDATE
or the UPSERT
statements to update existing data.
UPSERT
is the combination of UPDATE
and INSERT
. If you update a vertex or an edge with UPSERT
, the database will insert a new vertex or edge if it does not exist.
Note
UPSERT
operates serially in a partition-based order. Therefore, it is slower than INSERT
OR UPDATE
. And UPSERT
has concurrency only between multiple partitions.
UPDATE
vertices:UPDATE VERTEX <vid> SET <properties to be updated>\n[WHEN <condition>] [YIELD <columns>];\n
UPDATE
edges:UPDATE EDGE ON <edge_type> <source vid> -> <destination vid> [@rank] \nSET <properties to be updated> [WHEN <condition>] [YIELD <columns to be output>];\n
UPSERT
vertices or edges:UPSERT {VERTEX <vid> | EDGE <edge_type>} SET <update_columns>\n[WHEN <condition>] [YIELD <columns>];\n
UPDATE
the name
property of the vertex with VID player100
and check the result with the FETCH
statement.nebula> UPDATE VERTEX \"player100\" SET player.name = \"Tim\";\n\nnebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n+------------------------+\n| properties(VERTEX) |\n+------------------------+\n| {age: 42, name: \"Tim\"} |\n+------------------------+\n
UPDATE
the degree
property of an edge and check the result with the FETCH
statement.nebula> UPDATE EDGE ON follow \"player101\" -> \"player100\" SET degree = 96;\n\nnebula> FETCH PROP ON follow \"player101\" -> \"player100\" YIELD properties(edge);\n+------------------+\n| properties(EDGE) |\n+------------------+\n| {degree: 96} |\n+------------------+\n
player111
and UPSERT
it.nebula> INSERT VERTEX player(name,age) VALUES \"player111\":(\"David West\", 38);\n\nnebula> UPSERT VERTEX \"player111\" SET player.name = \"David\", player.age = $^.player.age + 11 \\\n WHEN $^.player.name == \"David West\" AND $^.player.age > 20 \\\n YIELD $^.player.name AS Name, $^.player.age AS Age;\n+---------+-----+\n| Name | Age |\n+---------+-----+\n| \"David\" | 49 |\n+---------+-----+\n
DELETE VERTEX <vid1>[, <vid2>...]\n
DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>]\n[, <src_vid> -> <dst_vid>...]\n
nebula> DELETE VERTEX \"player111\", \"team203\";\n
nebula> DELETE EDGE follow \"player101\" -> \"team204\";\n
Users can add indexes to tags and edge types with the CREATE INDEX statement.
Must-read for using indexes
Both MATCH
and LOOKUP
statements depend on the indexes. But indexes can dramatically reduce the write performance. DO NOT use indexes in production environments unless you are fully aware of their influences on your service.
Users MUST rebuild indexes for pre-existing data. Otherwise, the pre-existing data cannot be indexed and therefore cannot be returned in MATCH
or LOOKUP
statements. For more information, see REBUILD INDEX.
CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name>\nON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT = '<comment>'];\n
REBUILD {TAG | EDGE} INDEX <index_name>;\n
Note
Define the index length when creating an index for a variable-length property. In UTF-8 encoding, a non-ascii character occupies 3 bytes. You should set an appropriate index length according to the variable-length property. For example, the index should be 30 bytes for 10 non-ascii characters. For more information, see CREATE INDEX
"},{"location":"2.quick-start/4.nebula-graph-crud/#examples_of_lookup_and_match_index-based","title":"Examples ofLOOKUP
and MATCH
(index-based)","text":"Make sure there is an index for LOOKUP
or MATCH
to use. If there is not, create an index first.
Find the information of the vertex with the tag player
and its value of the name
property is Tony Parker
.
This example creates the index player_index_1
on the name
property.
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_1 ON player(name(20));\n
This example rebuilds the index to make sure it takes effect on pre-existing data.
nebula> REBUILD TAG INDEX player_index_1\n+------------+\n| New Job Id |\n+------------+\n| 31 |\n+------------+\n
This example uses the LOOKUP
statement to retrieve the vertex property.
nebula> LOOKUP ON player WHERE player.name == \"Tony Parker\" \\\n YIELD properties(vertex).name AS name, properties(vertex).age AS age;\n+---------------+-----+\n| name | age |\n+---------------+-----+\n| \"Tony Parker\" | 36 |\n+---------------+-----+\n
This example uses the MATCH
statement to retrieve the vertex property.
nebula> MATCH (v:player{name:\"Tony Parker\"}) RETURN v;\n+-----------------------------------------------------+\n| v |\n+-----------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n+-----------------------------------------------------+\n
"},{"location":"2.quick-start/5.start-stop-service/","title":"Step 2: Manage NebulaGraph Service","text":"NebulaGraph supports managing services with scripts.
"},{"location":"2.quick-start/5.start-stop-service/#manage_services_with_script","title":"Manage services with script","text":"You can use the nebula.service
script to start, stop, restart, terminate, and check the NebulaGraph services.
Note
nebula.service
is stored in the /usr/local/nebula/scripts
directory by default. If you have customized the path, use the actual path in your environment.
$ sudo /usr/local/nebula/scripts/nebula.service\n[-v] [-c <config_file_path>]\n<start | stop | restart | kill | status>\n<metad | graphd | storaged | all>\n
Parameter Description -v
Display detailed debugging information. -c
Specify the configuration file path. The default path is /usr/local/nebula/etc/
. start
Start the target services. stop
Stop the target services. restart
Restart the target services. kill
Terminate the target services. status
Check the status of the target services. metad
Set the Meta Service as the target service. graphd
Set the Graph Service as the target service. storaged
Set the Storage Service as the target service. all
Set all the NebulaGraph services as the target services."},{"location":"2.quick-start/5.start-stop-service/#start_nebulagraph","title":"Start NebulaGraph","text":"Run the following command to start NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service start all\n[INFO] Starting nebula-metad...\n[INFO] Done\n[INFO] Starting nebula-graphd...\n[INFO] Done\n[INFO] Starting nebula-storaged...\n[INFO] Done\n
"},{"location":"2.quick-start/5.start-stop-service/#stop_nebulagraph","title":"Stop NebulaGraph","text":"Danger
Do not run kill -9
to forcibly terminate the processes. Otherwise, there is a low probability of data loss.
Run the following command to stop NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service stop all\n[INFO] Stopping nebula-metad...\n[INFO] Done\n[INFO] Stopping nebula-graphd...\n[INFO] Done\n[INFO] Stopping nebula-storaged...\n[INFO] Done\n
"},{"location":"2.quick-start/5.start-stop-service/#check_the_service_status","title":"Check the service status","text":"Run the following command to check the service status of NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service status all\n
NebulaGraph is running normally if the following information is returned.
INFO] nebula-metad(33fd35e): Running as 29020, Listening on 9559\n[INFO] nebula-graphd(33fd35e): Running as 29095, Listening on 9669\n[WARN] nebula-storaged after v3.0.0 will not start service until it is added to cluster.\n[WARN] See Manage Storage hosts:ADD HOSTS in https://docs.nebula-graph.io/\n[INFO] nebula-storaged(33fd35e): Running as 29147, Listening on 9779\n
Note
After starting NebulaGraph, the port of the nebula-storaged
process is shown in red. Because the nebula-storaged
process waits for the nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
[INFO] nebula-metad: Running as 25600, Listening on 9559\n[INFO] nebula-graphd: Exited\n[INFO] nebula-storaged: Running as 25646, Listening on 9779\n
The NebulaGraph services consist of the Meta Service, Graph Service, and Storage Service. The configuration files for all three services are stored in the /usr/local/nebula/etc/
directory by default. You can check the configuration files according to the returned result to troubleshoot problems.
Connect to NebulaGraph
"},{"location":"2.quick-start/6.cheatsheet-for-ngql/","title":"nGQL cheatsheet","text":""},{"location":"2.quick-start/6.cheatsheet-for-ngql/#functions","title":"Functions","text":"Math functions
Function Description double abs(double x) Returns the absolute value of the argument. double floor(double x) Returns the largest integer value smaller than or equal to the argument. (Rounds down) double ceil(double x) Returns the smallest integer greater than or equal to the argument. (Rounds up) double round(double x) Returns the integer value nearest to the argument. Returns a number farther away from 0 if the argument is in the middle. double sqrt(double x) Returns the square root of the argument. double cbrt(double x) Returns the cubic root of the argument. double hypot(double x, double y) Returns the hypotenuse of a right-angled triangle. double pow(double x, double y) Returns the result of xy. double exp(double x) Returns the result of ex. double exp2(double x) Returns the result of 2x. double log(double x) Returns the base-e logarithm of the argument. double log2(double x) Returns the base-2 logarithm of the argument. double log10(double x) Returns the base-10 logarithm of the argument. double sin(double x) Returns the sine of the argument. double asin(double x) Returns the inverse sine of the argument. double cos(double x) Returns the cosine of the argument. double acos(double x) Returns the inverse cosine of the argument. double tan(double x) Returns the tangent of the argument. double atan(double x) Returns the inverse tangent of the argument. double rand() Returns a random floating point number in the range from 0 (inclusive) to 1 (exclusive); i.e.[0,1). int rand32(int min, int max) Returns a random 32-bit integer in[min, max)
.If you set only one argument, it is parsed as max
and min
is 0
by default.If you set no argument, the system returns a random signed 32-bit integer. int rand64(int min, int max) Returns a random 64-bit integer in [min, max)
.If you set only one argument, it is parsed as max
and min
is 0
by default.If you set no argument, the system returns a random signed 64-bit integer. bit_and() Bitwise AND. bit_or() Bitwise OR. bit_xor() Bitwise XOR. int size() Returns the number of elements in a list or a map or the length of a string. int range(int start, int end, int step) Returns a list of integers from [start,end]
in the specified steps. step
is 1 by default. int sign(double x) Returns the signum of the given number.If the number is 0
, the system returns 0
.If the number is negative, the system returns -1
.If the number is positive, the system returns 1
. double e() Returns the base of the natural logarithm, e (2.718281828459045). double pi() Returns the mathematical constant pi (3.141592653589793). double radians() Converts degrees to radians. radians(180)
returns 3.141592653589793
. Aggregating functions
Function Description avg() Returns the average value of the argument. count() Syntax:count({expr | *})
.count()
returns the number of rows (including NULL). count(expr)
returns the number of non-NULL values that meet the expression. count() and size() are different. max() Returns the maximum value. min() Returns the minimum value. collect() The collect() function returns a list containing the values returned by an expression. Using this function aggregates data by merging multiple records or values into a single list. std() Returns the population standard deviation. sum() Returns the sum value. String functions
Function Description int strcasecmp(string a, string b) Compares string a and b without case sensitivity. When a = b, the return string lower(string a) Returns the argument in lowercase. string toLower(string a) The same aslower()
. string upper(string a) Returns the argument in uppercase. string toUpper(string a) The same as upper()
. int length(a) Returns the length of the given string in bytes or the length of a path in hops. string trim(string a) Removes leading and trailing spaces. string ltrim(string a) Removes leading spaces. string rtrim(string a) Removes trailing spaces. string left(string a, int count) Returns a substring consisting of count
characters from the left side of string right(string a, int count) Returns a substring consisting of count
characters from the right side of string lpad(string a, int size, string letters) Left-pads string a with string letters
and returns a string rpad(string a, int size, string letters) Right-pads string a with string letters
and returns a string substr(string a, int pos, int count) Returns a substring extracting count
characters starting from string substring(string a, int pos, int count) The same as substr()
. string reverse(string) Returns a string in reverse order. string replace(string a, string b, string c) Replaces string b in string a with string c. list split(string a, string b) Splits string a at string b and returns a list of strings. concat() The concat()
function requires at least two or more strings. All the parameters are concatenated into one string.Syntax: concat(string1,string2,...)
concat_ws() The concat_ws()
function connects two or more strings with a predefined separator. extract() extract()
uses regular expression matching to retrieve a single substring or all substrings from a string. json_extract() The json_extract()
function converts the specified JSON string to map. Data and time functions
Function Description int now() Returns the current timestamp of the system. timestamp timestamp() Returns the current timestamp of the system. date date() Returns the current UTC date based on the current system. time time() Returns the current UTC time based on the current system. datetime datetime() Returns the current UTC date and time based on the current system.Schema-related functions
For nGQL statements
Function Description id(vertex) Returns the ID of a vertex. The data type of the result is the same as the vertex ID. map properties(vertex) Returns the properties of a vertex. map properties(edge) Returns the properties of an edge. string type(edge) Returns the edge type of an edge. src(edge) Returns the source vertex ID of an edge. The data type of the result is the same as the vertex ID. dst(edge) Returns the destination vertex ID of an edge. The data type of the result is the same as the vertex ID. int rank(edge) Returns the rank value of an edge. vertex Returns the information of vertices, including VIDs, tags, properties, and values. edge Returns the information of edges, including edge types, source vertices, destination vertices, ranks, properties, and values. vertices Returns the information of vertices in a subgraph. For more information, see GET SUBGRAPH. edges Returns the information of edges in a subgraph. For more information, see GET SUBGRAPH. path Returns the information of a path. For more information, see FIND PATH.For statements compatible with openCypher
Function Description id(<vertex>) Returns the ID of a vertex. The data type of the result is the same as the vertex ID. list tags(<vertex>) Returns the Tag of a vertex, which serves the same purpose as labels(). list labels(<vertex>) Returns the Tag of a vertex, which serves the same purpose as tags(). This function is used for compatibility with openCypher syntax. map properties(<vertex_or_edge>) Returns the properties of a vertex or an edge. string type(<edge>) Returns the edge type of an edge. src(<edge>) Returns the source vertex ID of an edge. The data type of the result is the same as the vertex ID. dst(<edge>) Returns the destination vertex ID of an edge. The data type of the result is the same as the vertex ID. vertex startNode(<path>) Visits an edge or a path and returns its source vertex ID. string endNode(<path>) Visits an edge or a path and returns its destination vertex ID. int rank(<edge>) Returns the rank value of an edge.List functions
Function Description keys(expr) Returns a list containing the string representations for all the property names of vertices, edges, or maps. labels(vertex) Returns the list containing all the tags of a vertex. nodes(path) Returns the list containing all the vertices in a path. range(start, end [, step]) Returns the list containing all the fixed-length steps in[start,end]
. step
is 1 by default. relationships(path) Returns the list containing all the relationships in a path. reverse(list) Returns the list reversing the order of all elements in the original list. tail(list) Returns all the elements of the original list, excluding the first one. head(list) Returns the first element of a list. last(list) Returns the last element of a list. reduce() The reduce()
function applies an expression to each element in a list one by one, chains the result to the next iteration by taking it as the initial value, and returns the final result. Type conversion functions
Function Description bool toBoolean() Converts a string value to a boolean value. float toFloat() Converts an integer or string value to a floating point number. string toString() Converts non-compound types of data, such as numbers, booleans, and so on, to strings. int toInteger() Converts a floating point or string value to an integer value. set toSet() Converts a list or set value to a set value. int hash() Thehash()
function returns the hash value of the argument. The argument can be a number, a string, a list, a boolean, null, or an expression that evaluates to a value of the preceding data types. Predicate functions
Predicate functions return true
or false
. They are most commonly used in WHERE
clauses.
<predicate>(<variable> IN <list> WHERE <condition>)\n
Function Description exists() Returns true
if the specified property exists in the vertex, edge or map. Otherwise, returns false
. any() Returns true
if the specified predicate holds for at least one element in the given list. Otherwise, returns false
. all() Returns true
if the specified predicate holds for all elements in the given list. Otherwise, returns false
. none() Returns true
if the specified predicate holds for no element in the given list. Otherwise, returns false
. single() Returns true
if the specified predicate holds for exactly one of the elements in the given list. Otherwise, returns false
. Conditional expressions functions
Function Description CASE TheCASE
expression uses conditions to filter the result of an nGQL query statement. It is usually used in the YIELD
and RETURN
clauses. The CASE
expression will traverse all the conditions. When the first condition is met, the CASE
expression stops reading the conditions and returns the result. If no conditions are met, it returns the result in the ELSE
clause. If there is no ELSE
clause and no conditions are met, it returns NULL
. coalesce() Returns the first not null value in all expressions. MATCH
MATCH <pattern> [<clause_1>] RETURN <output> [<clause_2>];\n
Pattern Example Description Match vertices (v)
You can use a user-defined variable in a pair of parentheses to represent a vertex in a pattern. For example: (v)
. Match tags MATCH (v:player) RETURN v
You can specify a tag with :<tag_name>
after the vertex in a pattern. Match multiple tags MATCH (v:player:team) RETURN v
To match vertices with multiple tags, use colons (:). Match vertex properties MATCH (v:player{name:\"Tim Duncan\"}) RETURN v
MATCH (v) WITH v, properties(v) as props, keys(properties(v)) as kk WHERE [i in kk where props[i] == \"Tim Duncan\"] RETURN v
You can specify a vertex property with {<prop_name>: <prop_value>}
after the tag in a pattern; or use a vertex property value to get vertices directly. Match a VID. MATCH (v) WHERE id(v) == 'player101' RETURN v
You can use the VID to match a vertex. The id()
function can retrieve the VID of a vertex. Match multiple VIDs. MATCH (v:player { name: 'Tim Duncan' })--(v2) WHERE id(v2) IN [\"player101\", \"player102\"] RETURN v2
To match multiple VIDs, use WHERE id(v) IN [vid_list]
. Match connected vertices MATCH (v:player{name:\"Tim Duncan\"})--(v2) RETURN v2.player.name AS Name
You can use the --
symbol to represent edges of both directions and match vertices connected by these edges. You can add a >
or <
to the --
symbol to specify the direction of an edge. Match paths MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) RETURN p
Connected vertices and edges form a path. You can use a user-defined variable to name a path as follows. Match edges MATCH (v:player{name:\"Tim Duncan\"})-[e]-(v2) RETURN e
MATCH ()<-[e]-() RETURN e
Besides using --
, -->
, or <--
to indicate a nameless edge, you can use a user-defined variable in a pair of square brackets to represent a named edge. For example: -[e]-
. Match an edge type MATCH ()-[e:follow]-() RETURN e
Just like vertices, you can specify an edge type with :<edge_type>
in a pattern. For example: -[e:follow]-
. Match edge type properties MATCH (v:player{name:\"Tim Duncan\"})-[e:follow{degree:95}]->(v2) RETURN e
MATCH ()-[e]->() WITH e, properties(e) as props, keys(properties(e)) as kk WHERE [i in kk where props[i] == 90] RETURN e
You can specify edge type properties with {<prop_name>: <prop_value>}
in a pattern. For example: [e:follow{likeness:95}]
; or use an edge type property value to get edges directly. Match multiple edge types MATCH (v:player{name:\"Tim Duncan\"})-[e:follow | :serve]->(v2) RETURN e
The |
symbol can help matching multiple edge types. For example: [e:follow|:serve]
. The English colon (:) before the first edge type cannot be omitted, but the English colon before the subsequent edge type can be omitted, such as [e:follow|serve]
. Match multiple edges MATCH (v:player{name:\"Tim Duncan\"})-[]->(v2)<-[e:serve]-(v3) RETURN v2, v3
You can extend a pattern to match multiple edges in a path. Match fixed-length paths MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) RETURN DISTINCT v2 AS Friends
You can use the :<edge_type>*<hop>
pattern to match a fixed-length path. hop
must be a non-negative integer. The data type of e
is the list. Match variable-length paths MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..3]->(v2) RETURN v2 AS Friends
minHop
: Optional. It represents the minimum length of the path. minHop
: must be a non-negative integer. The default value is 1.minHop
and maxHop
are optional and the default value is 1 and infinity respectively. The data type of e
is the list. Match variable-length paths with multiple edge types MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow | serve*2]->(v2) RETURN DISTINCT v2
You can specify multiple edge types in a fixed-length or variable-length pattern. In this case, hop
, minHop
, and maxHop
take effect on all edge types. The data type of e
is the list. Retrieve vertex or edge information MATCH (v:player{name:\"Tim Duncan\"}) RETURN v
MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) RETURN e
Use RETURN {<vertex_name> | <edge_name>}
to retrieve all the information of a vertex or an edge. Retrieve VIDs MATCH (v:player{name:\"Tim Duncan\"}) RETURN id(v)
Use the id()
function to retrieve VIDs. Retrieve tags MATCH (v:player{name:\"Tim Duncan\"}) RETURN labels(v)
Use the labels()
function to retrieve the list of tags on a vertex.To retrieve the nth element in the labels(v)
list, use labels(v)[n-1]
. Retrieve a single property on a vertex or an edge MATCH (v:player{name:\"Tim Duncan\"}) RETURN v.player.age
Use RETURN {<vertex_name> | <edge_name>}.<property>
to retrieve a single property.Use AS
to specify an alias for a property. Retrieve all properties on a vertex or an edge MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN properties(v2)
Use the properties()
function to retrieve all properties on a vertex or an edge. Retrieve edge types MATCH p=(v:player{name:\"Tim Duncan\"})-[e]->() RETURN DISTINCT type(e)
Use the type()
function to retrieve the matched edge types. Retrieve paths MATCH p=(v:player{name:\"Tim Duncan\"})-[*3]->() RETURN p
Use RETURN <path_name>
to retrieve all the information of the matched paths. Retrieve vertices in a path MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN nodes(p)
Use the nodes()
function to retrieve all vertices in a path. Retrieve edges in a path MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN relationships(p)
Use the relationships()
function to retrieve all edges in a path. Retrieve path length MATCH p=(v:player{name:\"Tim Duncan\"})-[*..2]->(v2) RETURN p AS Paths, length(p) AS Length
Use the length()
function to retrieve the length of a path. OPTIONAL MATCH
Pattern Example Description Matches patterns against your graph database, just likeMATCH
does. MATCH (m)-[]->(n) WHERE id(m)==\"player100\" OPTIONAL MATCH (n)-[]->(l) RETURN id(m),id(n),id(l)
If no matches are found, OPTIONAL MATCH
will use a null for missing parts of the pattern. LOOKUP
LOOKUP ON {<vertex_tag> | <edge_type>} \n[WHERE <expression> [AND <expression> ...]] \nYIELD <return_list> [AS <alias>]\n
Pattern Example Description Retrieve vertices LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD player.name AS name, player.age AS age
The following example returns vertices whose name
is Tony Parker
and the tag is player
. Retrieve edges LOOKUP ON follow WHERE follow.degree == 90 YIELD follow.degree
Returns edges whose degree
is 90
and the edge type is follow
. List vertices with a tag LOOKUP ON player YIELD properties(vertex),id(vertex)
Shows how to retrieve the VID of all vertices tagged with player
. List edges with an edge types LOOKUP ON follow YIELD edge AS e
Shows how to retrieve the source Vertex IDs, destination vertex IDs, and ranks of all edges of the follow
edge type. Count the numbers of vertices or edges LOOKUP ON player YIELD id(vertex)| YIELD COUNT(*) AS Player_Count
Shows how to count the number of vertices tagged with player
. Count the numbers of edges LOOKUP ON follow YIELD edge as e| YIELD COUNT(*) AS Like_Count
Shows how to count the number of edges of the follow
edge type. GO
GO [[<M> TO] <N> {STEP|STEPS} ] FROM <vertex_list>\nOVER <edge_type_list> [{REVERSELY | BIDIRECT}]\n[ WHERE <conditions> ]\nYIELD [DISTINCT] <return_list>\n[{SAMPLE <sample_list> | LIMIT <limit_list>}]\n[| GROUP BY {col_name | expr | position} YIELD <col_name>]\n[| ORDER BY <expression> [{ASC | DESC}]]\n[| LIMIT [<offset_value>,] <number_rows>]\n
Example Description GO FROM \"player102\" OVER serve YIELD dst(edge)
Returns the teams that player 102 serves. GO 2 STEPS FROM \"player102\" OVER follow YIELD dst(edge)
Returns the friends of player 102 with 2 hops. GO FROM \"player100\", \"player102\" OVER serve WHERE properties(edge).start_year > 1995 YIELD DISTINCT properties($$).name AS team_name, properties(edge).start_year AS start_year, properties($^).name AS player_name
Adds a filter for the traversal. GO FROM \"player100\" OVER follow, serve YIELD properties(edge).degree, properties(edge).start_year
The following example traverses along with multiple edge types. If there is no value for a property, the output is NULL
. GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS destination
The following example returns the neighbor vertices in the incoming direction of player 100. GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS id | GO FROM $-.id OVER serve WHERE properties($^).age > 20 YIELD properties($^).name AS FriendOf, properties($$).name AS Team
The following example retrieves the friends of player 100 and the teams that they serve. GO FROM \"player102\" OVER follow YIELD dst(edge) AS both
The following example returns all the neighbor vertices of player 102. GO 2 STEPS FROM \"player100\" OVER follow YIELD src(edge) AS src, dst(edge) AS dst, properties($$).age AS age | GROUP BY $-.dst YIELD $-.dst AS dst, collect_set($-.src) AS src, collect($-.age) AS age
The following example the outputs according to age. FETCH
Fetch vertex properties
FETCH PROP ON {<tag_name>[, tag_name ...] | *} \n<vid> [, vid ...] \nYIELD <return_list> [AS <alias>]\n
Example Description FETCH PROP ON player \"player100\" YIELD properties(vertex)
Specify a tag in the FETCH
statement to fetch the vertex properties by that tag. FETCH PROP ON player \"player100\" YIELD player.name AS name
Use a YIELD
clause to specify the properties to be returned. FETCH PROP ON player \"player101\", \"player102\", \"player103\" YIELD properties(vertex)
Specify multiple VIDs (vertex IDs) to fetch properties of multiple vertices. Separate the VIDs with commas. FETCH PROP ON player, t1 \"player100\", \"player103\" YIELD properties(vertex)
Specify multiple tags in the FETCH
statement to fetch the vertex properties by the tags. Separate the tags with commas. FETCH PROP ON * \"player100\", \"player106\", \"team200\" YIELD properties(vertex)
Set an asterisk symbol *
to fetch properties by all tags in the current graph space. Fetch edge properties
FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]\nYIELD <output>;\n
Example Description FETCH PROP ON serve \"player100\" -> \"team204\" YIELD properties(edge)
The following statement fetches all the properties of the serve
edge that connects vertex \"player100\"
and vertex \"team204\"
. FETCH PROP ON serve \"player100\" -> \"team204\" YIELD serve.start_year
Use a YIELD
clause to fetch specific properties of an edge. FETCH PROP ON serve \"player100\" -> \"team204\", \"player133\" -> \"team202\" YIELD properties(edge)
Specify multiple edge patterns (<src_vid> -> <dst_vid>[@<rank>]
) to fetch properties of multiple edges. Separate the edge patterns with commas. FETCH PROP ON serve \"player100\" -> \"team204\"@1 YIELD properties(edge)
To fetch on an edge whose rank is not 0, set its rank in the FETCH statement. GO FROM \"player101\" OVER follow YIELD follow._src AS s, follow._dst AS d | FETCH PROP ON follow $-.s -> $-.d YIELD follow.degree
The following statement returns the degree
values of the follow
edges that start from vertex \"player101\"
. $var = GO FROM \"player101\" OVER follow YIELD follow._src AS s, follow._dst AS d; FETCH PROP ON follow $var.s -> $var.d YIELD follow.degree
You can use user-defined variables to construct similar queries. SHOW
Statement Syntax Example Description SHOW CHARSETSHOW CHARSET
SHOW CHARSET
Shows the available character sets. SHOW COLLATION SHOW COLLATION
SHOW COLLATION
Shows the collations supported by NebulaGraph. SHOW CREATE SPACE SHOW CREATE SPACE <space_name>
SHOW CREATE SPACE basketballplayer
Shows the creating statement of the specified graph space. SHOW CREATE TAG/EDGE SHOW CREATE {TAG <tag_name> | EDGE <edge_name>}
SHOW CREATE TAG player
Shows the basic information of the specified tag. SHOW HOSTS SHOW HOSTS [GRAPH | STORAGE | META]
SHOW HOSTS
SHOW HOSTS GRAPH
Shows the host and version information of Graph Service, Storage Service, and Meta Service. SHOW INDEX STATUS SHOW {TAG | EDGE} INDEX STATUS
SHOW TAG INDEX STATUS
Shows the status of jobs that rebuild native indexes, which helps check whether a native index is successfully rebuilt or not. SHOW INDEXES SHOW {TAG | EDGE} INDEXES
SHOW TAG INDEXES
Shows the names of existing native indexes. SHOW PARTS SHOW PARTS [<part_id>]
SHOW PARTS
Shows the information of a specified partition or all partitions in a graph space. SHOW ROLES SHOW ROLES IN <space_name>
SHOW ROLES in basketballplayer
Shows the roles that are assigned to a user account. SHOW SNAPSHOTS SHOW SNAPSHOTS
SHOW SNAPSHOTS
Shows the information of all the snapshots. SHOW SPACES SHOW SPACES
SHOW SPACES
Shows existing graph spaces in NebulaGraph. SHOW STATS SHOW STATS
SHOW STATS
Shows the statistics of the graph space collected by the latest STATS
job. SHOW TAGS/EDGES SHOW TAGS | EDGES
SHOW TAGS
,SHOW EDGES
Shows all the tags in the current graph space. SHOW USERS SHOW USERS
SHOW USERS
Shows the user information. SHOW SESSIONS SHOW SESSIONS
SHOW SESSIONS
Shows the information of all the sessions. SHOW SESSIONS SHOW SESSION <Session_Id>
SHOW SESSION 1623304491050858
Shows a specified session with its ID. SHOW QUERIES SHOW [ALL] QUERIES
SHOW QUERIES
Shows the information of working queries in the current session. SHOW META LEADER SHOW META LEADER
SHOW META LEADER
Shows the information of the leader in the current Meta cluster. GROUP BY <var> YIELD <var>, <aggregation_function(var)>
GO FROM \"player100\" OVER follow BIDIRECT YIELD $$.player.name as Name | GROUP BY $-.Name YIELD $-.Name as Player, count(*) AS Name_Count
Finds all the vertices connected directly to vertex \"player100\"
, groups the result set by player names, and counts how many times the name shows up in the result set. LIMIT YIELD <var> [| LIMIT [<offset_value>,] <number_rows>]
GO FROM \"player100\" OVER follow REVERSELY YIELD $$.player.name AS Friend, $$.player.age AS Age | ORDER BY $-.Age, $-.Friend | LIMIT 1, 3
Returns the 3 rows of data starting from the second row of the sorted output. SKIP RETURN <var> [SKIP <offset>] [LIMIT <number_rows>]
MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) RETURN v2.player.name AS Name, v2.player.age AS Age ORDER BY Age DESC SKIP 1
SKIP
can be used alone to set the offset and return the data after the specified position. SAMPLE <go_statement> SAMPLE <sample_list>;
GO 3 STEPS FROM \"player100\" OVER * YIELD properties($$).name AS NAME, properties($$).age AS Age SAMPLE [1,2,3];
Takes samples evenly in the result set and returns the specified amount of data. ORDER BY <YIELD clause> ORDER BY <expression> [ASC | DESC] [, <expression> [ASC | DESC] ...]
FETCH PROP ON player \"player100\", \"player101\", \"player102\", \"player103\" YIELD player.age AS age, player.name AS name | ORDER BY $-.age ASC, $-.name DESC
The ORDER BY
clause specifies the order of the rows in the output. RETURN RETURN {<vertex_name>|<edge_name>|<vertex_name>.<property>|<edge_name>.<property>|...}
MATCH (v:player) RETURN v.player.name, v.player.age LIMIT 3
Returns the first three rows with values of the vertex properties name
and age
. TTL CREATE TAG <tag_name>(<property_name_1> <property_value_1>, <property_name_2> <property_value_2>, ...) ttl_duration= <value_int>, ttl_col = <property_name>
CREATE TAG t2(a int, b int, c string) ttl_duration= 100, ttl_col = \"a\"
Create a tag and set the TTL options. WHERE WHERE {<vertex|edge_alias>.<property_name> {>|==|<|...} <value>...}
MATCH (v:player) WHERE v.player.name == \"Tim Duncan\" XOR (v.player.age < 30 AND v.player.name == \"Yao Ming\") OR NOT (v.player.name == \"Yao Ming\" OR v.player.name == \"Tim Duncan\") RETURN v.player.name, v.player.age
The WHERE
clause filters the output by conditions. The WHERE
clause usually works in Native nGQL GO
and LOOKUP
statements, and OpenCypher MATCH
and WITH
statements. YIELD YIELD [DISTINCT] <col> [AS <alias>] [, <col> [AS <alias>] ...] [WHERE <conditions>];
GO FROM \"player100\" OVER follow YIELD dst(edge) AS ID | FETCH PROP ON player $-.ID YIELD player.age AS Age | YIELD AVG($-.Age) as Avg_age, count(*)as Num_friends
Finds the players that \"player100\" follows and calculates their average age. WITH MATCH $expressions WITH {nodes()|labels()|...}
MATCH p=(v:player{name:\"Tim Duncan\"})--() WITH nodes(p) AS n UNWIND n AS n1 RETURN DISTINCT n1
The WITH
clause can retrieve the output from a query part, process it, and pass it to the next query part as the input. UNWIND UNWIND <list> AS <alias> <RETURN clause>
UNWIND [1,2,3] AS n RETURN n
Splits a list into rows."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#space_statements","title":"Space statements","text":"Statement Syntax Example Description CREATE SPACE CREATE SPACE [IF NOT EXISTS] <graph_space_name> ( [partition_num = <partition_number>,] [replica_factor = <replica_number>,] vid_type = {FIXED_STRING(<N>) | INT[64]} ) [COMMENT = '<comment>']
CREATE SPACE my_space_1 (vid_type=FIXED_STRING(30))
Creates a graph space with CREATE SPACE CREATE SPACE <new_graph_space_name> AS <old_graph_space_name>
CREATE SPACE my_space_4 as my_space_3
Clone a graph. space. USE USE <graph_space_name>
USE space1
Specifies a graph space as the current working graph space for subsequent queries. SHOW SPACES SHOW SPACES
SHOW SPACES
Lists all the graph spaces in the NebulaGraph examples. DESCRIBE SPACE DESC[RIBE] SPACE <graph_space_name>
DESCRIBE SPACE basketballplayer
Returns the information about the specified graph space. CLEAR SPACE CLEAR SPACE [IF EXISTS] <graph_space_name>
Deletes the vertices and edges in a graph space, but does not delete the graph space itself and the schema information. DROP SPACE DROP SPACE [IF EXISTS] <graph_space_name>
DROP SPACE basketballplayer
Deletes everything in the specified graph space."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#tag_statements","title":"TAG statements","text":"Statement Syntax Example Description CREATE TAG CREATE TAG [IF NOT EXISTS] <tag_name> ( <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'] [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] ) [TTL_DURATION = <ttl_duration>] [TTL_COL = <prop_name>] [COMMENT = '<comment>']
CREATE TAG woman(name string, age int, married bool, salary double, create_time timestamp) TTL_DURATION = 100, TTL_COL = \"create_time\"
Creates a tag with the given name in a graph space. DROP TAG DROP TAG [IF EXISTS] <tag_name>
DROP TAG test;
Drops a tag with the given name in the current working graph space. ALTER TAG ALTER TAG <tag_name> <alter_definition> [, alter_definition] ...] [ttl_definition [, ttl_definition] ... ] [COMMENT = '<comment>']
ALTER TAG t1 ADD (p3 int, p4 string)
Alters the structure of a tag with the given name in a graph space. You can add or drop properties, and change the data type of an existing property. You can also set a TTL (Time-To-Live) on a property, or change its TTL duration. SHOW TAGS SHOW TAGS
SHOW TAGS
Shows the name of all tags in the current graph space. DESCRIBE TAG DESC[RIBE] TAG <tag_name>
DESCRIBE TAG player
Returns the information about a tag with the given name in a graph space, such as field names, data type, and so on. DELETE TAG DELETE TAG <tag_name_list> FROM <VID>
DELETE TAG test1 FROM \"test\"
Deletes a tag with the given name on a specified vertex."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#edge_type_statements","title":"Edge type statements","text":"Statement Syntax Example Description CREATE EDGE CREATE EDGE [IF NOT EXISTS] <edge_type_name> ( <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'] [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] ) [TTL_DURATION = <ttl_duration>] [TTL_COL = <prop_name>] [COMMENT = '<comment>']
CREATE EDGE e1(p1 string, p2 int, p3 timestamp) TTL_DURATION = 100, TTL_COL = \"p2\"
Creates an edge type with the given name in a graph space. DROP EDGE DROP EDGE [IF EXISTS] <edge_type_name>
DROP EDGE e1
Drops an edge type with the given name in a graph space. ALTER EDGE ALTER EDGE <edge_type_name> <alter_definition> [, alter_definition] ...] [ttl_definition [, ttl_definition] ... ] [COMMENT = '<comment>']
ALTER EDGE e1 ADD (p3 int, p4 string)
Alters the structure of an edge type with the given name in a graph space. SHOW EDGES SHOW EDGES
SHOW EDGES
Shows all edge types in the current graph space. DESCRIBE EDGE DESC[RIBE] EDGE <edge_type_name>
DESCRIBE EDGE follow
Returns the information about an edge type with the given name in a graph space, such as field names, data type, and so on."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#vertex_statements","title":"Vertex statements","text":"Statement Syntax Example Description INSERT VERTEX INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...] VALUES <vid>: ([prop_value_list])
INSERT VERTEX t2 (name, age) VALUES \"13\":(\"n3\", 12), \"14\":(\"n4\", 8)
Inserts one or more vertices into a graph space in NebulaGraph. DELETE VERTEX DELETE VERTEX <vid> [, <vid> ...]
DELETE VERTEX \"team1\"
Deletes vertices and the related incoming and outgoing edges of the vertices. UPDATE VERTEX UPDATE VERTEX ON <tag_name> <vid> SET <update_prop> [WHEN <condition>] [YIELD <output>]
UPDATE VERTEX ON player \"player101\" SET age = age + 2
Updates properties on tags of a vertex. UPSERT VERTEX UPSERT VERTEX ON <tag> <vid> SET <update_prop> [WHEN <condition>] [YIELD <output>]
UPSERT VERTEX ON player \"player667\" SET age = 31
The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT VERTEX
to update the properties of a vertex if it exists or insert a new vertex if it does not exist."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#edge_statements","title":"Edge statements","text":"Statement Syntax Example Description INSERT EDGE INSERT EDGE [IF NOT EXISTS] <edge_type> ( <prop_name_list> ) VALUES <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ) [, <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ), ...]
INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 1)
Inserts an edge or multiple edges into a graph space from a source vertex (given by src_vid) to a destination vertex (given by dst_vid) with a specific rank in NebulaGraph. DELETE EDGE DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid>[@<rank>] ...]
DELETE EDGE serve \"player100\" -> \"team204\"@0
Deletes one edge or multiple edges at a time. UPDATE EDGE UPDATE EDGE ON <edge_type> <src_vid> -> <dst_vid> [@<rank>] SET <update_prop> [WHEN <condition>] [YIELD <output>]
UPDATE EDGE ON serve \"player100\" -> \"team204\"@0 SET start_year = start_year + 1
Updates properties on an edge. UPSERT EDGE UPSERT EDGE ON <edge_type> <src_vid> -> <dst_vid> [@rank] SET <update_prop> [WHEN <condition>] [YIELD <properties>]
UPSERT EDGE on serve \"player666\" -> \"team200\"@0 SET end_year = 2021
The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT EDGE
to update the properties of an edge if it exists or insert a new edge if it does not exist."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#index","title":"Index","text":"Native index
You can use native indexes together with LOOKUP
and MATCH
statements.
CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name> ON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT = '<comment>']
CREATE TAG INDEX player_index on player()
Add native indexes for the existing tags, edge types, or properties. SHOW CREATE INDEX SHOW CREATE {TAG | EDGE} INDEX <index_name>
show create tag index index_2
Shows the statement used when creating a tag or an edge type. It contains detailed information about the index, such as its associated properties. SHOW INDEXES SHOW {TAG | EDGE} INDEXES
SHOW TAG INDEXES
Shows the defined tag or edge type indexes names in the current graph space. DESCRIBE INDEX DESCRIBE {TAG | EDGE} INDEX <index_name>
DESCRIBE TAG INDEX player_index_0
Gets the information about the index with a given name, including the property name (Field) and the property type (Type) of the index. REBUILD INDEX REBUILD {TAG | EDGE} INDEX [<index_name_list>]
REBUILD TAG INDEX single_person_index
Rebuilds the created tag or edge type index. If data is updated or inserted before the creation of the index, you must rebuild the indexes manually to make sure that the indexes contain the previously added data. SHOW INDEX STATUS SHOW {TAG | EDGE} INDEX STATUS
SHOW TAG INDEX STATUS
Returns the name of the created tag or edge type index and its status. DROP INDEX DROP {TAG | EDGE} INDEX [IF EXISTS] <index_name>
DROP TAG INDEX player_index_0
Removes an existing index from the current graph space. Full-text index
Syntax Example DescriptionSIGN IN TEXT SERVICE [(<elastic_ip:port> [,<username>, <password>]), (<elastic_ip:port>), ...]
SIGN IN TEXT SERVICE (127.0.0.1:9200)
The full-text indexes is implemented based on Elasticsearch. After deploying an Elasticsearch cluster, you can use the SIGN IN
statement to log in to the Elasticsearch client. SHOW TEXT SEARCH CLIENTS
SHOW TEXT SEARCH CLIENTS
Shows text search clients. SIGN OUT TEXT SERVICE
SIGN OUT TEXT SERVICE
Signs out to the text search clients. CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} (<prop_name> [,<prop_name>]...) [ANALYZER=\"<analyzer_name>\"]
CREATE FULLTEXT TAG INDEX nebula_index_1 ON player(name)
Creates full-text indexes. SHOW FULLTEXT INDEXES
SHOW FULLTEXT INDEXES
Show full-text indexes. REBUILD FULLTEXT INDEX
REBUILD FULLTEXT INDEX
Rebuild full-text indexes. DROP FULLTEXT INDEX <index_name>
DROP FULLTEXT INDEX nebula_index_1
Drop full-text indexes. LOOKUP ON {<tag> | <edge_type>} WHERE ES_QUERY(<index_name>, \"<text>\") YIELD <return_list> [| LIMIT [<offset>,] <number_rows>]
LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Chris\") YIELD id(vertex)
Use query options. GET SUBGRAPH [WITH PROP] [<step_count> {STEP|STEPS}] FROM {<vid>, <vid>...} [{IN | OUT | BOTH} <edge_type>, <edge_type>...] YIELD [VERTICES AS <vertex_alias>] [,EDGES AS <edge_alias>]
GET SUBGRAPH 1 STEPS FROM \"player100\" YIELD VERTICES AS nodes, EDGES AS relationships
Retrieves information of vertices and edges reachable from the source vertices of the specified edge types and returns information of the subgraph. FIND PATH FIND { SHORTEST | ALL | NOLOOP } PATH [WITH PROP] FROM <vertex_id_list> TO <vertex_id_list> OVER <edge_type_list> [REVERSELY | BIDIRECT] [<WHERE clause>] [UPTO <N> {STEP|STEPS}] YIELD path as <alias> [| ORDER BY $-.path] [| LIMIT <M>]
FIND SHORTEST PATH FROM \"player102\" TO \"team204\" OVER * YIELD path as p
Finds the paths between the selected source vertices and destination vertices. A returned path is like (<vertex_id>)-[:<edge_type_name>@<rank>]->(<vertex_id)
."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#query_tuning_statements","title":"Query tuning statements","text":"Type Syntax Example Description EXPLAIN EXPLAIN [format=\"row\" | \"dot\"] <your_nGQL_statement>
EXPLAIN format=\"row\" SHOW TAGS
EXPLAIN format=\"dot\" SHOW TAGS
Helps output the execution plan of an nGQL statement without executing the statement. PROFILE PROFILE [format=\"row\" | \"dot\"] <your_nGQL_statement>
PROFILE format=\"row\" SHOW TAGS
EXPLAIN format=\"dot\" SHOW TAGS
Executes the statement, then outputs the execution plan as well as the execution profile."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#operation_and_maintenance_statements","title":"Operation and maintenance statements","text":"SUBMIT JOB BALANCE
Syntax DescriptionBALANCE LEADER
Starts a job to balance the distribution of all the storage leaders in graph spaces. It returns the job ID. Job statements
Syntax DescriptionSUBMIT JOB COMPACT
Triggers the long-term RocksDB compact
operation. SUBMIT JOB FLUSH
Writes the RocksDB memfile in the memory to the hard disk. SUBMIT JOB STATS
Starts a job that makes the statistics of the current graph space. Once this job succeeds, you can use the SHOW STATS
statement to list the statistics. SHOW JOB <job_id>
Shows the information about a specific job and all its tasks in the current graph space. The Meta Service parses a SUBMIT JOB
request into multiple tasks and assigns them to the nebula-storaged processes. SHOW JOBS
Lists all the unexpired jobs in the current graph space. STOP JOB
Stops jobs that are not finished in the current graph space. RECOVER JOB
Re-executes the failed jobs in the current graph space and returns the number of recovered jobs. Kill queries
Syntax Example DescriptionKILL QUERY (session=<session_id>, plan=<plan_id>)
KILL QUERY(SESSION=1625553545984255,PLAN=163)
Terminates the query being executed, and is often used to terminate slow queries. Kill sessions
Syntax Example DescriptionKILL {SESSION|SESSIONS} <SessionId>
KILL SESSION 1672887983842984
Terminates a single session. SHOW SESSIONS | YIELD $-.SessionId AS sid [WHERE <filter_clause>] | KILL {SESSION|SESSIONS} $-.sid
SHOW SESSIONS | YIELD $-.SessionId AS sid, $-.CreateTime as CreateTime | ORDER BY $-.CreateTime ASC | LIMIT 2 | KILL SESSIONS $-.sid
Terminates multiple sessions based on specified criteria. SHOW SESSIONS | KILL SESSIONS $-.SessionId
SHOW SESSIONS | KILL SESSIONS $-.SessionId
Terminates all sessions. This topic lists the frequently asked questions for using NebulaGraph master. You can use the search box in the help center or the search function of the browser to match the questions you are looking for.
If the solutions described in this topic cannot solve your problems, ask for help on the NebulaGraph forum or submit an issue on GitHub issue.
"},{"location":"20.appendix/0.FAQ/#about_manual_updates","title":"About manual updates","text":""},{"location":"20.appendix/0.FAQ/#why_is_the_behavior_in_the_manual_not_consistent_with_the_system","title":"\"Why is the behavior in the manual not consistent with the system?\"","text":"NebulaGraph is still under development. Its behavior changes from time to time. Users can submit an issue to inform the team if the manual and the system are not consistent.
Note
If you find some errors in this topic:
pencil
button at the top right side of this page.Compatibility
Neubla Graph master is not compatible with NebulaGraph 1.x nor 2.0-RC in both data formats and RPC-protocols, and vice versa. The service process may quit if using an lower version client to connect to a higher version server.
To upgrade data formats, see Upgrade NebulaGraph to the current version. Users must upgrade all clients.
"},{"location":"20.appendix/0.FAQ/#about_execution_errors","title":"About execution errors","text":""},{"location":"20.appendix/0.FAQ/#how_to_resolve_the_error_-1005graphmemoryexceeded_-2600","title":"\"How to resolve the error-1005:GraphMemoryExceeded: (-2600)
?\"","text":"This error is issued by the Memory Tracker when it observes that memory usage has exceeded a set threshold. This mechanism can help avoid service processes from being terminated by the system's OOM (Out of Memory) killer. Steps to resolve:
Check memory usage: First, you need to check the memory usage during the execution of the command. If the memory usage is indeed high, then this error might be expected.
Check the configuration of the Memory Tracker: If the memory usage is not high, check the relevant configurations of the Memory Tracker. These include memory_tracker_untracked_reserved_memory_mb
(untracked reserved memory in MB), memory_tracker_limit_ratio
(memory limit ratio), and memory_purge_enabled
(whether memory purge is enabled). For the configuration of the Memory Tracker, see memory tracker configuration.
Optimize configurations: Adjust these configurations according to the actual situation. For example, if the available memory limit is too low, you can increase the value of memory_tracker_limit_ratio
.
SemanticError: Missing yield clause.
?\"","text":"Starting with NebulaGraph 3.0.0, the statements LOOKUP
, GO
, and FETCH
must output results with the YIELD
clause. For more information, see YIELD.
Host not enough!
?\"","text":"From NebulaGraph version 3.0.0, the Storage services added in the configuration files CANNOT be read or written directly. The configuration files only register the Storage services into the Meta services. You must run the ADD HOSTS
command to read and write data on Storage servers. For more information, see Manage Storage hosts.
To get the property of the vertex in 'v.age', should use the format 'var.tag.prop'
?\"","text":"From NebulaGraph version 3.0.0, patterns support matching multiple tags at the same time, so you need to specify a tag name when querying properties. The original statement RETURN variable_name.property_name
is changed to RETURN variable_name.<tag_name>.property_name
.
Used memory hits the high watermark(0.800000) of total system memory.
?\"","text":"The error may be caused if the system memory usage is higher than the threshold specified bysystem_memory_high_watermark_ratio
, which defaults to 0.8
. When the threshold is exceeded, an alarm is triggered and NebulaGraph stops processing queries.
Possible solutions are as follows:
system_memory_high_watermark_ratio
parameter to the configuration files of all Graph servers, and set it greater than 0.8
, such as 0.9
.However, the system_memory_high_watermark_ratio
parameter is deprecated. It is recommended that you use the Memory Tracker feature instead to limit the memory usage of Graph and Storage services. For more information, see Memory Tracker for Graph service and Memory Tracker for Storage service.
Storage Error E_RPC_FAILURE
?\"","text":"The reason for this error is usually that the storaged process returns too many data back to the graphd process. Possible solutions are as follows:
--storage_client_timeout_ms
in the nebula-graphd.conf
file to extend the connection timeout of the Storage client. This configuration is measured in milliseconds (ms). For example, set --storage_client_timeout_ms=60000
. If this parameter is not specified in the nebula-graphd.conf
file, specify it manually. Tip: Add --local_config=true
at the beginning of the configuration file and restart the service.LIMIT
is used to limit the number of returned results, use the GO
statement to rewrite the MATCH
statement (the former is optimized, while the latter is not).dmesg |grep nebula
).The leader has changed. Try again later
?\"","text":"It is a known issue. Just retry 1 to N times, where N is the partition number. The reason is that the meta client needs some heartbeats to update or errors to trigger the new leader information.
If this error occurs when logging in to NebulaGraph, you can consider using df -h
to view the disk space and check whether the local disk is full.
Schema not exist: xxx
?\"","text":"If the system returns Schema not exist
when querying, make sure that:
Problem description: The system reports Could not find artifact com.vesoft:client:jar:xxx-SNAPSHOT
when compiling.
Cause: There is no local Maven repository for storing or downloading SNAPSHOT packages. The default central repository in Maven only stores official releases, not development versions (SNAPSHOTs).
Solution: Add the following configuration in the profiles
scope of Maven's setting.xml
file:
<profile>\n <activation>\n <activeByDefault>true</activeByDefault>\n </activation>\n <repositories>\n <repository>\n <id>snapshots</id>\n <url>https://oss.sonatype.org/content/repositories/snapshots/</url>\n <snapshots>\n <enabled>true</enabled>\n </snapshots>\n </repository>\n </repositories>\n </profile>\n
"},{"location":"20.appendix/0.FAQ/#how_to_resolve_error_-1004_syntaxerror_syntax_error_near","title":"\"How to resolve [ERROR (-1004)]: SyntaxError: syntax error near
?\"","text":"In most cases, a query statement requires a YIELD
or a RETURN
. Check your query statement to see if YIELD
or RETURN
is provided.
can\u2019t solve the start vids from the sentence
?\"","text":"The graphd process requires start vids
to begin a graph traversal. The start vids
can be specified by the user. For example:
> GO FROM ${vids} ...\n> MATCH (src) WHERE id(src) == ${vids}\n# The \"start vids\" are explicitly given by ${vids}.\n
It can also be found from a property index. For example:
# CREATE TAG INDEX IF NOT EXISTS i_player ON player(name(20));\n# REBUILD TAG INDEX i_player;\n\n> LOOKUP ON player WHERE player.name == \"abc\" | ... YIELD ...\n> MATCH (src) WHERE src.name == \"abc\" ...\n# The \"start vids\" are found from the property index \"name\".\n
Otherwise, an error like can\u2019t solve the start vids from the sentence
will be returned.
Wrong vertex id type: 1001
?\"","text":"Check whether the VID is INT64
or FIXED_STRING(N)
set by create space
. For more information, see create space.
The VID must be a 64-bit integer or a string fitting space vertex id length limit.
?\"","text":"Check whether the length of the VID exceeds the limitation. For more information, see create space.
"},{"location":"20.appendix/0.FAQ/#how_to_resolve_the_error_edge_conflict_or_vertex_conflict","title":"\"How to resolve the erroredge conflict
or vertex conflict
?\"","text":"NebulaGraph may return such errors when the Storage service receives multiple requests to insert or update the same vertex or edge within milliseconds. Try the failed requests again later.
"},{"location":"20.appendix/0.FAQ/#how_to_resolve_the_error_rpc_failure_in_metaclient_connection_refused","title":"\"How to resolve the errorRPC failure in MetaClient: Connection refused
?\"","text":"The reason for this error is usually that the metad service status is unusual, or the network of the machine where the metad and graphd services are located is disconnected. Possible solutions are as follows:
telnet meta-ip:port
to check the network status under the server that returns an error.StorageClientBase.inl:214] Request to \"x.x.x.x\":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Timed Out
in nebula-graph.INFO
?\"","text":"The reason for this error may be that the amount of data to be queried is too large, and the storaged process has timed out. Possible solutions are as follows:
--storage_client_timeout_ms
in the nebula-graphd.conf
file. This configuration is measured in milliseconds (ms). The default value is 60000ms.MetaClient.cpp:65] Heartbeat failed, status:Wrong cluster!
in nebula-storaged.INFO
, or HBProcessor.cpp:54] Reject wrong cluster host \"x.x.x.x\":9771!
in nebula-metad.INFO
?\"","text":"The reason for this error may be that the user has modified the IP or the port information of the metad process, or the storage service has joined other clusters before. Possible solutions are as follows:
Delete the cluster.id
file in the installation directory where the storage machine is deployed (the default installation directory is /usr/local/nebula
), and restart the storaged service.
Storage Error: More than one request trying to add/update/delete one edge/vertex at he same time.
?\"","text":"The reason for this error is that the current NebulaGraph version does not support concurrent requests to the same vertex or edge at the same time. To solve this error, re-execute your commands.
"},{"location":"20.appendix/0.FAQ/#about_design_and_functions","title":"About design and functions","text":""},{"location":"20.appendix/0.FAQ/#how_is_the_time_spent_value_at_the_end_of_each_return_message_calculated","title":"\"How is thetime spent
value at the end of each return message calculated?\"","text":"Take the returned message of SHOW SPACES
as an example:
nebula> SHOW SPACES;\n+--------------------+\n| Name |\n+--------------------+\n| \"basketballplayer\" |\n+--------------------+\nGot 1 rows (time spent 1235/1934 us)\n
1235
shows the time spent by the database itself, that is, the time it takes for the query engine to receive a query from the client, fetch the data from the storage server, and perform a series of calculations.1934
shows the time spent from the client's perspective, that is, the time it takes for the client from sending a request, receiving a response, and displaying the result on the screen.nebula-storaged
process keep showing red after connecting to NebulaGraph?\"","text":"Because the nebula-storaged
process waits for nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
This is caused by the release of NebulaGraph Console 2.6.0, not the change of NebulaGraph core. And it will not affect the content of the returned data itself.
"},{"location":"20.appendix/0.FAQ/#about_dangling_edges","title":"About dangling edges","text":"A dangling edge is an edge that only connects to a single vertex and only one part of the edge connects to the vertex.
Dangling edges may appear in NebulaGraph master as the design. And there is no MERGE
statements of openCypher. The guarantee for dangling edges depends entirely on the application level. For more information, see INSERT VERTEX, DELETE VERTEX, INSERT EDGE, DELETE EDGE.
replica_factor
as an even number in CREATE SPACE
statements, e.g., replica_factor = 2
?\"","text":"NO.
The Storage service guarantees its availability based on the Raft consensus protocol. The number of failed replicas must not exceed half of the total replica number.
When the number of machines is 1, replica_factor
can only be set to1
.
When there are enough machines and replica_factor=2
, if one replica fails, the Storage service fails. No matter replica_factor=3
or replica_factor=4
, if more than one replica fails, the Storage Service fails. To prevent unnecessary waste of resources, we recommend that you set an odd replica number.
We suggest that you set replica_factor=3
for a production environment and replica_factor=1
for a test environment. Do not use an even number.
Yes. For more information, see Kill query.
"},{"location":"20.appendix/0.FAQ/#why_are_the_query_results_different_when_using_go_and_match_to_execute_the_same_semantic_query","title":"\"Why are the query results different when usingGO
and MATCH
to execute the same semantic query?\"","text":"The possible reasons are listed as follows.
GO
statements find the dangling edges.RETURN
commands do not specify the sequence.max_edge_returned_per_vertex
in the Storage service is triggered.Using different types of paths may cause different query results.
GO
statements use walk
. Both vertices and edges can be repeatedly visited in graph traversal.MATCH
statements are compatible with openCypher and use trail
. Only vertices can be repeatedly visited in graph traversal.The example is as follows.
All queries that start from A
with 5 hops will end at C
(A->B->C->D->E->C
). If it is 6 hops, the GO
statement will end at D
(A->B->C->D->E->C->D
), because the edge C->D
can be visited repeatedly. However, the MATCH
statement returns empty, because edges cannot be visited repeatedly.
Therefore, using GO
and MATCH
to execute the same semantic query may cause different query results.
For more information, see Wikipedia.
"},{"location":"20.appendix/0.FAQ/#how_to_count_the_verticesedges_number_of_each_tagedge_type","title":"\"How to count the vertices/edges number of each tag/edge type?\"","text":"See show-stats.
"},{"location":"20.appendix/0.FAQ/#how_to_get_all_the_verticesedge_of_each_tagedge_type","title":"\"How to get all the vertices/edge of each tag/edge type?\"","text":"Create and rebuild the index.
> CREATE TAG INDEX IF NOT EXISTS i_player ON player();\n> REBUILD TAG INDEX IF NOT EXISTS i_player;\n
Use LOOKUP
or MATCH
. For example:
> LOOKUP ON player;\n> MATCH (n:player) RETURN n;\n
For more information, see INDEX
, LOOKUP
, and MATCH
.
Yes, for more information, see Keywords and reserved words.
"},{"location":"20.appendix/0.FAQ/#how_to_get_the_out-degreethe_in-degree_of_a_given_vertex","title":"\"How to get the out-degree/the in-degree of a given vertex?\"","text":"The out-degree of a vertex refers to the number of edges starting from that vertex, while the in-degree refers to the number of edges pointing to that vertex.
nebula > MATCH (s)-[e]->() WHERE id(s) == \"given\" RETURN count(e); #Out-degree\nnebula > MATCH (s)<-[e]-() WHERE id(s) == \"given\" RETURN count(e); #In-degree\n
This is a very slow operation to get the out/in degree since no accelaration can be applied (no indices or caches). It also could be out-of-memory when hitting a supper-node.
"},{"location":"20.appendix/0.FAQ/#how_to_quickly_get_the_out-degree_and_in-degree_of_all_vertices","title":"\"How to quickly get the out-degree and in-degree of all vertices?\"","text":"There is no such command.
You can use NebulaGraph Algorithm.
"},{"location":"20.appendix/0.FAQ/#about_operation_and_maintenance","title":"About operation and maintenance","text":""},{"location":"20.appendix/0.FAQ/#the_runtime_log_files_are_too_large_how_to_recycle_the_logs","title":"\"The runtime log files are too large. How to recycle the logs?\"","text":"NebulaGraph uses glog for log printing, which does not support log recycling. You can manage runtime logs by using cron jobs or the log management tool logrotate. For operational details, see Log recycling.
"},{"location":"20.appendix/0.FAQ/#how_to_check_the_nebulagraph_version","title":"\"How to check the NebulaGraph version?\"","text":"If the service is running: run command SHOW HOSTS META
in nebula-console
. See SHOW HOSTS.
If the service is not running:
Different installation methods make the method of checking the version different. The instructions are as follows:
If the service is not running, run the command ./<binary_name> --version
to get the version and the Git commit IDs of the NebulaGraph binary files. For example:
$ ./nebula-graphd --version\n
If you deploy NebulaGraph with Docker Compose
Check the version of NebulaGraph deployed by Docker Compose. The method is similar to the previous method, except that you have to enter the container first. The commands are as follows:
docker exec -it nebula-docker-compose_graphd_1 bash\ncd bin/\n./nebula-graphd --version\n
If you install NebulaGraph with RPM/DEB package
Run rpm -qa |grep nebula
to check the version of NebulaGraph.
Warning
The cluster scaling function has not been officially released in the community edition. The operations involving SUBMIT JOB BALANCE DATA REMOVE
and SUBMIT JOB BALANCE DATA
are experimental features in the community edition and the functionality is not stable. Before using it in the community edition, make sure to back up your data first and set enable_experimental_feature
and enable_data_balance
to true
in the Graph configuration file.
NebulaGraph master does not provide any commands or tools to support automatic scale out/in. You can refer to the following steps:
Scale out and scale in metad: The metad process can not be scaled out or scale in. The process cannot be moved to a new machine. You cannot add a new metad process to the service.
Note
You can use the Meta transfer script tool to migrate Meta services. Note that the Meta-related settings in the configuration files of Storage and Graph services need to be modified correspondingly.
Scale in storaged: See Balance remove command. After the command is finished, stop this storaged process.
Caution
Scale out storaged: Prepare the binary and config files of the storaged process in the new host, modify the config files and add all existing addresses of the metad processes. Then register the storaged process to the metad, and then start the new storaged process. For details, see Register storaged services.
You also need to run Balance Data and Balance leader after scaling in/out storaged.
Currently, Storage cannot dynamically recognize new added disks. You can add or remove disks in the Storage nodes of the distributed cluster by following these steps:
Execute SUBMIT JOB BALANCE DATA REMOVE <ip:port>
to migrate data in the Storage node with the disk to be added or removed to other Storage nodes.
Caution
Execute DROP HOSTS <ip:port>
to remove the Storage node with the disk to be added or removed.
In the configuration file of all Storage nodes, configure the path of the new disk to be added or removed through --data_path
, see Storage configuration file for details.
ADD HOSTS <ip:port>
to re-add the Storage node with the disk to be added or removed.SUBMIT JOB BALANCE DATA
to evenly distribute the shards of the current space to all Storage nodes and execute SUBMIT JOB BALANCE LEADER
command to balance the leaders in all spaces. Before running the command, select a space.OFFLINE
. What should I do?\"","text":"Hosts with the status of OFFLINE
will be automatically deleted after one day.
The dmp file is an error report file detailing the exit of the process and can be viewed with the gdb utility. the Coredump file is saved in the directory of the startup binary (by default it is /usr/local/nebula
) and is generated automatically when the NebulaGraph service crashes.
$ file core.<pid>\n
$ gdb <process.name> core.<pid>\n
$(gdb) bt\n
For example:
$ file core.1316027\ncore.1316027: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from '/home/workspace/fork/nebula-debug/bin/nebula-metad --flagfile /home/k', real uid: 1008, effective uid: 1008, real gid: 1008, effective gid: 1008, execfn: '/home/workspace/fork/nebula-debug/bin/nebula-metad', platform: 'x86_64'\n\n$ gdb /home/workspace/fork/nebula-debug/bin/nebula-metad core.1316027\n\n$(gdb) bt\n#0 0x00007f9de58fecf5 in __memcpy_ssse3_back () from /lib64/libc.so.6\n#1 0x0000000000eb2299 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) ()\n#2 0x0000000000ef71a7 in nebula::meta::cpp2::QueryDesc::QueryDesc(nebula::meta::cpp2::QueryDesc const&) ()\n...\n
If you are not clear about the information that dmp prints out, you can post the printout with the OS version, hardware configuration, error logs before and after the Core file was created and actions that may have caused the error on the NebulaGraph forum.
"},{"location":"20.appendix/0.FAQ/#how_can_i_set_the_nebulagraph_service_to_start_automatically_on_boot_via_systemctl","title":"How can I set the NebulaGraph service to start automatically on boot via systemctl?","text":"Execute systemctl enable
to start the metad, graphd and storaged services.
[root]# systemctl enable nebula-metad.service\nCreated symlink from /etc/systemd/system/multi-user.target.wants/nebula-metad.service to /usr/lib/systemd/system/nebula-metad.service.\n[root]# systemctl enable nebula-graphd.service\nCreated symlink from /etc/systemd/system/multi-user.target.wants/nebula-graphd.service to /usr/lib/systemd/system/nebula-graphd.service.\n[root]# systemctl enable nebula-storaged.service\nCreated symlink from /etc/systemd/system/multi-user.target.wants/nebula-storaged.service to /usr/lib/systemd/system/nebula-storaged.service.\n
Configure the service files for metad, graphd, and storaged so that each service is restarted automatically if it exits.
Caution
The following points need to be noted when configuring the service files. - The paths in the PIDFile, ExecStart, ExecReload, and ExecStop parameters must match the actual paths on the server. - RestartSec is the time (in seconds) to wait before restarting, and can be adjusted as needed. - (Optional) StartLimitInterval controls restart rate limiting: by default, restarting is blocked if a service restarts more than 5 times within 10 seconds, and setting it to 0 allows unlimited restarts. - (Optional) LimitNOFILE sets the maximum number of open files for the service; the default is 1024 and can be changed as needed.
Configure the service file for the metad service.
$ vi /usr/lib/systemd/system/nebula-metad.service\n\n[Unit]\nDescription=Nebula Graph Metad Service\nAfter=network.target\n\n[Service]\nType=forking\nRestart=always\nRestartSec=15s\nPIDFile=/usr/local/nebula/pids/nebula-metad.pid\nExecStart=/usr/local/nebula/scripts/nebula.service start metad\nExecReload=/usr/local/nebula/scripts/nebula.service restart metad\nExecStop=/usr/local/nebula/scripts/nebula.service stop metad\nPrivateTmp=true\nStartLimitInterval=0\nLimitNOFILE=1024\n\n[Install]\nWantedBy=multi-user.target\n
Configure the service file for the graphd service.
$ vi /usr/lib/systemd/system/nebula-graphd.service\n[Unit]\nDescription=Nebula Graph Graphd Service\nAfter=network.target\n\n[Service]\nType=forking\nRestart=always\nRestartSec=15s\nPIDFile=/usr/local/nebula/pids/nebula-graphd.pid\nExecStart=/usr/local/nebula/scripts/nebula.service start graphd\nExecReload=/usr/local/nebula/scripts/nebula.service restart graphd\nExecStop=/usr/local/nebula/scripts/nebula.service stop graphd\nPrivateTmp=true\nStartLimitInterval=0\nLimitNOFILE=1024\n\n[Install]\nWantedBy=multi-user.target\n
Configure the service file for the storaged service. $ vi /usr/lib/systemd/system/nebula-storaged.service\n[Unit]\nDescription=Nebula Graph Storaged Service\nAfter=network.target\n\n[Service]\nType=forking\nRestart=always\nRestartSec=15s\nPIDFile=/usr/local/nebula/pids/nebula-storaged.pid\nExecStart=/usr/local/nebula/scripts/nebula.service start storaged\nExecReload=/usr/local/nebula/scripts/nebula.service restart storaged\nExecStop=/usr/local/nebula/scripts/nebula.service stop storaged\nPrivateTmp=true\nStartLimitInterval=0\nLimitNOFILE=1024\n\n[Install]\nWantedBy=multi-user.target\n
Reload the configuration file.
[root]# sudo systemctl daemon-reload\n
Restart the service.
$ systemctl restart nebula-metad.service\n$ systemctl restart nebula-graphd.service\n$ systemctl restart nebula-storaged.service\n
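After reloading and restarting, standard systemctl commands can be used to confirm that each service is active, for example:
$ systemctl status nebula-metad.service
$ systemctl status nebula-graphd.service
$ systemctl status nebula-storaged.service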
If you have not modified the predefined ports in the Configurations, open the following ports for the NebulaGraph services:
Service Port Meta 9559, 9560, 19559 Graph 9669, 19669 Storage 9777 ~ 9780, 19779. If you have customized the configuration files and changed the predefined ports, find the port numbers in your configuration files and open them on the firewalls.
For more port information, see Port Guide for Company Products.
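How the ports are opened depends on your firewall. On systems that use firewalld, for example, the Graph service ports could be opened with something like the following sketch (repeat for the ports of the other services):
$ sudo firewall-cmd --permanent --add-port=9669/tcp
$ sudo firewall-cmd --permanent --add-port=19669/tcp
$ sudo firewall-cmd --reload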
"},{"location":"20.appendix/0.FAQ/#how_to_test_whether_a_port_is_open_or_closed","title":"\"How to test whether a port is open or closed?\"","text":"You can use telnet as follows to check for port status.
telnet <ip> <port>\n
Note
If you cannot use the telnet command, check if telnet is installed or enabled on your host.
For example:
// If the port is open:\n$ telnet 192.168.1.10 9669\nTrying 192.168.1.10...\nConnected to 192.168.1.10.\nEscape character is '^]'.\n\n// If the port is closed or blocked:\n$ telnet 192.168.1.10 9777\nTrying 192.168.1.10...\ntelnet: connect to address 192.168.1.10: Connection refused\n
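If telnet cannot be installed, nc (netcat) provides a similar check, assuming it is available on the host:
$ nc -zv 192.168.1.10 9669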
"},{"location":"20.appendix/6.eco-tool-version/","title":"Ecosystem tools overview","text":""},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_studio","title":"NebulaGraph Studio","text":"NebulaGraph Studio (Studio for short) is a graph database visualization tool that can be accessed through the Web. It can be used with NebulaGraph DBMS to provide one-stop services such as composition, data import, writing nGQL queries, and graph exploration. For details, see What is NebulaGraph Studio.
Note
The release of the Studio is independent of NebulaGraph core, and its naming method is also not the same as the core naming rules.
NebulaGraph version Studio version master v3.9.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_dashboard_community_edition","title":"NebulaGraph Dashboard Community Edition","text":"NebulaGraph Dashboard Community Edition (Dashboard for short) is a visualization tool for monitoring the status of machines and services in the NebulaGraph cluster. For details, see What is NebulaGraph Dashboard.
NebulaGraph version Dashboard Community version master v3.4.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_exchange","title":"NebulaGraph Exchange","text":"NebulaGraph Exchange (Exchange for short) is an Apache Spark&trade application for batch migration of data in a cluster to NebulaGraph in a distributed environment. It can support the migration of batch data and streaming data in a variety of different formats. For details, see What is NebulaGraph Exchange.
NebulaGraph version Exchange Community version master v3.7.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_operator","title":"NebulaGraph Operator","text":"NebulaGraph Operator (Operator for short) is a tool to automate the deployment, operation, and maintenance of NebulaGraph clusters on Kubernetes. Building upon the excellent scalability mechanism of Kubernetes, NebulaGraph introduced its operation and maintenance knowledge into the Kubernetes system, which makes NebulaGraph a real cloud-native graph database. For more information, see What is NebulaGraph Operator.
NebulaGraph version Operator version master v1.8.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_importer","title":"NebulaGraph Importer","text":"NebulaGraph Importer (Importer for short) is a CSV file import tool for NebulaGraph. The Importer can read the local CSV file, and then import the data into the NebulaGraph database. For details, see What is NebulaGraph Importer.
NebulaGraph version Importer version master v4.1.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_spark_connector","title":"NebulaGraph Spark Connector","text":"NebulaGraph Spark Connector is a Spark connector that provides the ability to read and write NebulaGraph data in the Spark standard format. NebulaGraph Spark Connector consists of two parts, Reader and Writer. For details, see What is NebulaGraph Spark Connector.
NebulaGraph version Spark Connector version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_flink_connector","title":"NebulaGraph Flink Connector","text":"NebulaGraph Flink Connector is a connector that helps Flink users quickly access NebulaGraph. It supports reading data from the NebulaGraph database or writing data read from other external data sources to the NebulaGraph database. For details, see What is NebulaGraph Flink Connector.
NebulaGraph version Flink Connector version master v3.5.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_algorithm","title":"NebulaGraph Algorithm","text":"NebulaGraph Algorithm (Algorithm for short) is a Spark application based on GraphX, which uses a complete algorithm tool to analyze data in the NebulaGraph database by submitting a Spark task To perform graph computing, use the algorithm under the lib repository through programming to perform graph computing for DataFrame. For details, see What is NebulaGraph Algorithm.
NebulaGraph version Algorithm version master v3.0.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_console","title":"NebulaGraph Console","text":"NebulaGraph Console is the native CLI client of NebulaGraph. For how to use it, see NebulaGraph Console.
NebulaGraph version Console version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_docker_compose","title":"NebulaGraph Docker Compose","text":"Docker Compose can quickly deploy NebulaGraph clusters. For how to use it, please refer to Docker Compose Deployment NebulaGraph.
NebulaGraph version Docker Compose version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#backup_restore","title":"Backup & Restore","text":"Backup&Restore (BR for short) is a command line interface (CLI) tool that can help back up the graph space data of NebulaGraph, or restore it through a backup file data.
NebulaGraph version BR version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_bench","title":"NebulaGraph Bench","text":"NebulaGraph Bench is used to test the baseline performance data of NebulaGraph. It uses the standard data set of LDBC.
NebulaGraph version Bench version master v1.2.0"},{"location":"20.appendix/6.eco-tool-version/#api_and_sdk","title":"API and SDK","text":"Compatibility
Select the latest version of X.Y.*
which is the same as the core version.
The following are useful utilities and tools contributed and maintained by community users.
NebulaGraph returns an error code when an error occurs. This topic describes the details of the error code returned.
Note
0
, it means that the operation is successful.E_DISCONNECTED
-1
Lost connection E_FAIL_TO_CONNECT
-2
Unable to establish connection E_RPC_FAILURE
-3
RPC failure E_LEADER_CHANGED
-4
Raft leader has been changed E_SPACE_NOT_FOUND
-5
Graph space does not exist E_TAG_NOT_FOUND
-6
Tag does not exist E_EDGE_NOT_FOUND
-7
Edge type does not exist E_INDEX_NOT_FOUND
-8
Index does not exist E_EDGE_PROP_NOT_FOUND
-9
Edge type property does not exist E_TAG_PROP_NOT_FOUND
-10
Tag property does not exist E_ROLE_NOT_FOUND
-11
The current role does not exist E_CONFIG_NOT_FOUND
-12
The current configuration does not exist E_MACHINE_NOT_FOUND
-13
The current host does not exist E_LISTENER_NOT_FOUND
-15
Listener does not exist E_PART_NOT_FOUND
-16
The current partition does not exist E_KEY_NOT_FOUND
-17
Key does not exist E_USER_NOT_FOUND
-18
User does not exist E_STATS_NOT_FOUND
-19
Statistics do not exist E_SERVICE_NOT_FOUND
-20
No current service found E_DRAINER_NOT_FOUND
-21
Drainer does not exist E_DRAINER_CLIENT_NOT_FOUND
-22
Drainer client does not exist E_PART_STOPPED
-23
The current partition has already been stopped E_BACKUP_FAILED
-24
Backup failed E_BACKUP_EMPTY_TABLE
-25
The backed-up table is empty E_BACKUP_TABLE_FAILED
-26
Table backup failure E_PARTIAL_RESULT
-27
MultiGet could not get all data E_REBUILD_INDEX_FAILED
-28
Index rebuild failed E_INVALID_PASSWORD
-29
Password is invalid E_FAILED_GET_ABS_PATH
-30
Unable to get absolute path E_BAD_USERNAME_PASSWORD
-1001
Authentication failed E_SESSION_INVALID
-1002
Invalid session E_SESSION_TIMEOUT
-1003
Session timeout E_SYNTAX_ERROR
-1004
Syntax error E_EXECUTION_ERROR
-1005
Execution error E_STATEMENT_EMPTY
-1006
Statement is empty E_BAD_PERMISSION
-1008
Permission denied E_SEMANTIC_ERROR
-1009
Semantic error E_TOO_MANY_CONNECTIONS
-1010
Maximum number of connections exceeded E_PARTIAL_SUCCEEDED
-1011
Access to storage failed (only some requests succeeded) E_NO_HOSTS
-2001
Host does not exist E_EXISTED
-2002
Host already exists E_INVALID_HOST
-2003
Invalid host E_UNSUPPORTED
-2004
The current command, statement, or function is not supported E_NOT_DROP
-2005
Not allowed to drop E_CONFIG_IMMUTABLE
-2007
Configuration items cannot be changed E_CONFLICT
-2008
Parameters conflict with meta data E_INVALID_PARM
-2009
Invalid parameter E_WRONGCLUSTER
-2010
Wrong cluster E_ZONE_NOT_ENOUGH
-2011
Listener conflicts E_ZONE_IS_EMPTY
-2012
Host does not exist E_SCHEMA_NAME_EXISTS
-2013
Schema name already exists E_RELATED_INDEX_EXISTS
-2014
Indexes related to the tag or edge type still exist, so it cannot be dropped E_RELATED_SPACE_EXISTS
-2015
Some graph spaces still exist on the host, so it cannot be dropped E_STORE_FAILURE
-2021
Failed to store data E_STORE_SEGMENT_ILLEGAL
-2022
Illegal storage segment E_BAD_BALANCE_PLAN
-2023
Invalid data balancing plan E_BALANCED
-2024
The cluster is already in the data balancing status E_NO_RUNNING_BALANCE_PLAN
-2025
There is no running data balancing plan E_NO_VALID_HOST
-2026
Lack of valid hosts E_CORRUPTED_BALANCE_PLAN
-2027
A data balancing plan that has been corrupted E_IMPROPER_ROLE
-2030
Failed to recover user role E_INVALID_PARTITION_NUM
-2031
Invalid number of partitions E_INVALID_REPLICA_FACTOR
-2032
Invalid replica factor E_INVALID_CHARSET
-2033
Invalid character set E_INVALID_COLLATE
-2034
Invalid character sorting rules E_CHARSET_COLLATE_NOT_MATCH
-2035
Character set and character sorting rule mismatch E_SNAPSHOT_FAILURE
-2040
Failed to generate a snapshot E_BLOCK_WRITE_FAILURE
-2041
Failed to write block data E_ADD_JOB_FAILURE
-2044
Failed to add new task E_STOP_JOB_FAILURE
-2045
Failed to stop task E_SAVE_JOB_FAILURE
-2046
Failed to save task information E_BALANCER_FAILURE
-2047
Data balancing failed E_JOB_NOT_FINISHED
-2048
The current task has not been completed E_TASK_REPORT_OUT_DATE
-2049
Task report failed E_JOB_NOT_IN_SPACE
-2050
The current task is not in the graph space E_JOB_NEED_RECOVER
-2051
The current task needs to be resumed E_JOB_ALREADY_FINISH
-2052
The job status has already been failed or finished E_JOB_SUBMITTED
-2053
Job default status E_JOB_NOT_STOPPABLE
-2054
The given job does not support being stopped E_JOB_HAS_NO_TARGET_STORAGE
-2055
The leader distribution has not been reported, so tasks cannot be sent to storage E_INVALID_JOB
-2065
Invalid task E_BACKUP_BUILDING_INDEX
-2066
Backup terminated (index being created) E_BACKUP_SPACE_NOT_FOUND
-2067
Graph space does not exist at the time of backup E_RESTORE_FAILURE
-2068
Backup recovery failed E_SESSION_NOT_FOUND
-2069
Session does not exist E_LIST_CLUSTER_FAILURE
-2070
Failed to get cluster information E_LIST_CLUSTER_GET_ABS_PATH_FAILURE
-2071
Failed to get absolute path when getting cluster information E_LIST_CLUSTER_NO_AGENT_FAILURE
-2072
Unable to get an agent when getting cluster information E_QUERY_NOT_FOUND
-2073
Query not found E_AGENT_HB_FAILUE
-2074
Failed to receive heartbeat from agent E_HOST_CAN_NOT_BE_ADDED
-2082
The host cannot be added because it is not a storage host E_ACCESS_ES_FAILURE
-2090
Failed to access Elasticsearch E_GRAPH_MEMORY_EXCEEDED
-2600
Graph memory exceeded E_CONSENSUS_ERROR
-3001
Consensus cannot be reached during an election E_KEY_HAS_EXISTS
-3002
Key already exists E_DATA_TYPE_MISMATCH
-3003
Data type mismatch E_INVALID_FIELD_VALUE
-3004
Invalid field value E_INVALID_OPERATION
-3005
Invalid operation E_NOT_NULLABLE
-3006
Current value is not allowed to be empty E_FIELD_UNSET
-3007
Field value must be set if the field value is NOT NULL
or has no default value E_OUT_OF_RANGE
-3008
The value is out of the range of the current type E_DATA_CONFLICT_ERROR
-3010
Data conflict E_WRITE_STALLED
-3011
Writes are delayed E_IMPROPER_DATA_TYPE
-3021
Incorrect data type E_INVALID_SPACEVIDLEN
-3022
Invalid VID length E_INVALID_FILTER
-3031
Invalid filter E_INVALID_UPDATER
-3032
Invalid field update E_INVALID_STORE
-3033
Invalid KV storage E_INVALID_PEER
-3034
Peer invalid E_RETRY_EXHAUSTED
-3035
Out of retries E_TRANSFER_LEADER_FAILED
-3036
Leader change failed E_INVALID_STAT_TYPE
-3037
Invalid stat type E_INVALID_VID
-3038
VID is invalid E_LOAD_META_FAILED
-3040
Failed to load meta information E_FAILED_TO_CHECKPOINT
-3041
Failed to generate checkpoint E_CHECKPOINT_BLOCKED
-3042
Generating checkpoint is blocked E_FILTER_OUT
-3043
Data is filtered E_INVALID_DATA
-3044
Invalid data E_MUTATE_EDGE_CONFLICT
-3045
Concurrent write conflicts on the same edge E_MUTATE_TAG_CONFLICT
-3046
Concurrent write conflict on the same vertex E_OUTDATED_LOCK
-3047
Lock is invalid E_INVALID_TASK_PARA
-3051
Invalid task parameter E_USER_CANCEL
-3052
The user canceled the task E_TASK_EXECUTION_FAILED
-3053
Task execution failed E_PLAN_IS_KILLED
-3060
Execution plan was cleared E_NO_TERM
-3070
The heartbeat process was not completed when the request was received E_OUTDATED_TERM
-3071
Out-of-date heartbeat received from the old leader (the new leader has been elected) E_WRITE_WRITE_CONFLICT
-3073
Concurrent write conflicts with later requests E_RAFT_UNKNOWN_PART
-3500
Unknown partition E_RAFT_LOG_GAP
-3501
Raft logs lag behind E_RAFT_LOG_STALE
-3502
Raft logs are out of date E_RAFT_TERM_OUT_OF_DATE
-3503
Heartbeat messages are out of date E_RAFT_UNKNOWN_APPEND_LOG
-3504
Unknown additional logs E_RAFT_WAITING_SNAPSHOT
-3511
Waiting for the snapshot to complete E_RAFT_SENDING_SNAPSHOT
-3512
There was an error sending the snapshot E_RAFT_INVALID_PEER
-3513
Invalid receiver E_RAFT_NOT_READY
-3514
Raft did not start E_RAFT_STOPPED
-3515
Raft has stopped E_RAFT_BAD_ROLE
-3516
Wrong role E_RAFT_WAL_FAIL
-3521
Write to a WAL failed E_RAFT_HOST_STOPPED
-3522
The host has stopped E_RAFT_TOO_MANY_REQUESTS
-3523
Too many requests E_RAFT_PERSIST_SNAPSHOT_FAILED
-3524
Persistent snapshot failed E_RAFT_RPC_EXCEPTION
-3525
RPC exception E_RAFT_NO_WAL_FOUND
-3526
No WAL logs found E_RAFT_HOST_PAUSED
-3527
Host suspended E_RAFT_WRITE_BLOCKED
-3528
Writes are blocked E_RAFT_BUFFER_OVERFLOW
-3529
Cache overflow E_RAFT_ATOMIC_OP_FAILED
-3530
Atomic operation failed E_LEADER_LEASE_FAILED
-3531
Leader lease expired E_RAFT_CAUGHT_UP
-3532
Data has been synchronized on Raft E_STORAGE_MEMORY_EXCEEDED
-3600
Storage memory exceeded E_LOG_GAP
-4001
Drainer logs lag behind E_LOG_STALE
-4002
Drainer logs are out of date E_INVALID_DRAINER_STORE
-4003
The drainer data storage is invalid E_SPACE_MISMATCH
-4004
Graph space mismatch E_PART_MISMATCH
-4005
Partition mismatch E_DATA_CONFLICT
-4006
Data conflict E_REQ_CONFLICT
-4007
Request conflict E_DATA_ILLEGAL
-4008
Illegal data E_CACHE_CONFIG_ERROR
-5001
Cache configuration error E_NOT_ENOUGH_SPACE
-5002
Insufficient space E_CACHE_MISS
-5003
No cache hit E_CACHE_WRITE_FAILURE
-5005
Write cache failed E_NODE_NUMBER_EXCEED_LIMIT
-7001
Number of machines exceeded the limit E_PARSING_LICENSE_FAILURE
-7002
Failed to resolve certificate E_UNKNOWN
-8000
Unknown error"},{"location":"20.appendix/history/","title":"History timeline for NebulaGraph","text":"2018.9: dutor wrote and submitted the first line of NebulaGraph database code.
2019.5: NebulaGraph v0.1.0-alpha was released as open-source.
NebulaGraph v1.0.0-beta, v1.0.0-rc1, v1.0.0-rc2, v1.0.0-rc3, and v1.0.0-rc4 were released one after another within a year thereafter.
2019.7: NebulaGraph's debut at HBaseCon1. @dangleptr
2020.3: NebulaGraph v2.0 was starting developed in the final stage of v1.0 development.
2020.6: The first major version of NebulaGraph v1.0.0 GA was released.
2021.3: The second major version of NebulaGraph v2.0 GA was released.
2021.8: NebulaGraph v2.5.0 was released.
2021.10: NebulaGraph v2.6.0 was released.
2022.2: NebulaGraph v3.0.0 was released.
2022.4: NebulaGraph v3.1.0 was released.
2022.7: NebulaGraph v3.2.0 was released.
2022.10: NebulaGraph v3.3.0 was released.
2023.2: NebulaGraph v3.4.0 was released.
2023.5: NebulaGraph v3.5.0 was released.
2023.8: NebulaGraph v3.6.0 was released.
NebulaGraph v1.x supports both RocksDB and HBase as its storage engines. NebulaGraph v2.x removes HBase supports.\u00a0\u21a9
The following are the default ports used by NebulaGraph core and peripheral tools.
No. Product / Service Type Default Description 1 NebulaGraph TCP 9669 Graph service RPC daemon listening port. Commonly used for client connections to the Graph service. 2 NebulaGraph TCP 19669 Graph service HTTP port. 3 NebulaGraph TCP 19670 Graph service HTTP/2 port. (Deprecated after version 3.x) 4 NebulaGraph TCP 9559, 95609559
is the RPC daemon listening port for Meta service. Commonly used by Graph and Storage services for querying and updating metadata in the graph database. The neighboring +1
(9560
) port is used for Raft communication between Meta services. 5 NebulaGraph TCP 19559 Meta service HTTP port. 6 NebulaGraph TCP 19560 Meta service HTTP/2 port. (Deprecated after version 3.x) 7 NebulaGraph TCP 9779, 9778, 9780 9779
is the RPC daemon listening port for Storage service. Commonly used by Graph services for data storage-related operations, such as reading, writing, or deleting data. The neighboring ports -1
(9778
) and +1
(9780
) are also used. 9778
: The port used by the Admin service, which receives Meta commands for Storage. 9780
: The port used for Raft communication between Storage services. 8 NebulaGraph TCP 19779 Storage service HTTP port. 9 NebulaGraph TCP 19780 Storage service HTTP/2 port. (Deprecated after version 3.x) 10 NebulaGraph TCP 8888 Backup and restore Agent service port. The Agent is a daemon running on each machine in the cluster, responsible for starting and stopping NebulaGraph services and uploading and downloading backup files. 11 NebulaGraph TCP 9789, 9788, 9790 9789
is the Raft Listener port for Full-text index, which reads data from Storage services and writes it to the Elasticsearch cluster.Also the port for Storage Listener in inter-cluster data synchronization, used for synchronizing Storage data from the primary cluster. The neighboring ports -1
(9788
) and +1
(9790
) are also used.9788
: An internal port.9790
: The port used for Raft communication. 12 NebulaGraph TCP 9200 NebulaGraph uses this port for HTTP communication with Elasticsearch to perform full-text search queries and manage full-text indexes. 13 NebulaGraph TCP 9569, 9568, 9570 9569
is the Meta Listener port in inter-cluster data synchronization, used for synchronizing Meta data from the primary cluster. The neighboring ports -1
(9568
) and +1
(9570
) are also used.9568
: An internal port.9570
: The port used for Raft communication. 14 NebulaGraph TCP 9889, 9888, 9890 Drainer service port in inter-cluster data synchronization, used for synchronizing Storage and Meta data to the primary cluster. The neighboring ports -1
(9888
) and +1
(9890
) are also used.9888
: An internal port.9890
: The port used for Raft communication. 15 NebulaGraph Studio TCP 7001 Studio web service port. 16 NebulaGraph Dashboard TCP 8090 Nebula HTTP Gateway dependency service port. Provides an HTTP interface for cluster services to interact with the NebulaGraph database using nGQL statements.0 17 NebulaGraph Dashboard TCP 9200 Nebula Stats Exporter dependency service port. Collects cluster performance metrics, including service IP addresses, versions, and monitoring metrics (such as query count, query latency, heartbeat latency, etc.). 18 NebulaGraph Dashboard TCP 9100 Node Exporter dependency service port. Collects resource information for machines in the cluster, including CPU, memory, load, disk, and traffic. 19 NebulaGraph Dashboard TCP 9090 Prometheus service port. Time-series database for storing monitoring data. 20 NebulaGraph Dashboard TCP 7003 Dashboard Community Edition web service port."},{"location":"20.appendix/release-notes/dashboard-comm-release-note/","title":"NebulaGraph Dashboard Community Edition release notes","text":""},{"location":"20.appendix/release-notes/dashboard-comm-release-note/#community_edition_340","title":"Community Edition 3.4.0","text":"machine
.num_queries
, and adjust the display to time series aggregation.Enhance the full-text index. #5567 #5575 #5577 #5580 #5584 #5587
The changes involved are listed below:
DeleteRange
operation. #5525MATCH
statement when querying for non-existent properties. #5634Find All Path
statement. #5621 #5640MATCH
statement causes the all()
function push-down optimization to fail. #5631MATCH
statement that returns incorrect results when querying the self-loop by the shortest path. #5636MATCH
statement that returns missing properties of edges when matching multiple hops. #5646The long-term tasks run by the Storage Service are called jobs, such as COMPACT
, FLUSH
, and STATS
. These jobs can be time-consuming if the data amount in the graph space is large. The job manager helps you run, show, stop, and recover jobs.
Note
All job management commands can be executed only after selecting a graph space.
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_balance_leader","title":"SUBMIT JOB BALANCE LEADER","text":"Starts a job to balance the distribution of all the storage leaders in all graph spaces. It returns the job ID.
For example:
nebula> SUBMIT JOB BALANCE LEADER;\n+------------+\n| New Job Id |\n+------------+\n| 33 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_compact","title":"SUBMIT JOB COMPACT","text":"The SUBMIT JOB COMPACT
statement triggers the long-term RocksDB compact
operation in the current graph space.
For more information about compact
configuration, see Storage Service configuration.
For example:
nebula> SUBMIT JOB COMPACT;\n+------------+\n| New Job Id |\n+------------+\n| 40 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_flush","title":"SUBMIT JOB FLUSH","text":"The SUBMIT JOB FLUSH
statement writes the RocksDB memfile in the memory to the hard disk in the current graph space.
For example:
nebula> SUBMIT JOB FLUSH;\n+------------+\n| New Job Id |\n+------------+\n| 96 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_stats","title":"SUBMIT JOB STATS","text":"The SUBMIT JOB STATS
statement starts a job that makes the statistics of the current graph space. Once this job succeeds, you can use the SHOW STATS
statement to list the statistics. For more information, see SHOW STATS.
Note
If the data stored in the graph space changes, in order to get the latest statistics, you have to run SUBMIT JOB STATS
again.
For example:
nebula> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 9 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_downloadingest","title":"SUBMIT JOB DOWNLOAD/INGEST","text":"The SUBMIT JOB DOWNLOAD HDFS
and SUBMIT JOB INGEST
commands are used to import the SST file into NebulaGraph. For detail, see Import data from SST files.
The SUBMIT JOB DOWNLOAD HDFS
command will download the SST file on the specified HDFS.
The SUBMIT JOB INGEST
command will import the downloaded SST file into NebulaGraph.
For example:
nebula> SUBMIT JOB DOWNLOAD HDFS \"hdfs://192.168.10.100:9000/sst\";\n+------------+\n| New Job Id |\n+------------+\n| 10 |\n+------------+\nnebula> SUBMIT JOB INGEST;\n+------------+\n| New Job Id |\n+------------+\n| 11 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#show_job","title":"SHOW JOB","text":"The Meta Service parses a SUBMIT JOB
request into multiple tasks and assigns them to the nebula-storaged processes. The SHOW JOB <job_id>
statement shows the information about a specific job and all its tasks in the current graph space.
job_id
is returned when you run the SUBMIT JOB
statement.
For example:
nebula> SHOW JOB 8;\n+----------------+-----------------+------------+----------------------------+----------------------------+-------------+\n| Job Id(TaskId) | Command(Dest) | Status | Start Time | Stop Time | Error Code |\n+----------------+-----------------+------------+----------------------------+----------------------------+-------------+\n| 8 | \"STATS\" | \"FINISHED\" | 2022-10-18T08:14:45.000000 | 2022-10-18T08:14:45.000000 | \"SUCCEEDED\" |\n| 0 | \"192.168.8.129\" | \"FINISHED\" | 2022-10-18T08:14:45.000000 | 2022-10-18T08:15:13.000000 | \"SUCCEEDED\" |\n| \"Total:1\" | \"Succeeded:1\" | \"Failed:0\" | \"In Progress:0\" | \"\" | \"\" |\n+----------------+-----------------+------------+----------------------------+----------------------------+-------------+\n
The descriptions are as follows.
Parameter DescriptionJob Id(TaskId)
The first row shows the job ID and the other rows show the task IDs and the last row shows the total number of job-related tasks. Command(Dest)
The first row shows the command executed and the other rows show on which storaged processes the task is running. The last row shows the number of successful tasks related to the job. Status
Shows the status of the job or task. The last row shows the number of failed tasks related to the job. For more information, see Job status. Start Time
Shows a timestamp indicating the time when the job or task enters the RUNNING
phase. The last row shows the number of ongoing tasks related to the job. Stop Time
Shows a timestamp indicating the time when the job or task gets FINISHED
, FAILED
, or STOPPED
. Error Code
The error code of job."},{"location":"3.ngql-guide/4.job-statements/#job_status","title":"Job status","text":"The descriptions are as follows.
Status Description QUEUE The job or task is waiting in a queue. TheStart Time
is empty in this phase. RUNNING The job or task is running. The Start Time
shows the beginning time of this phase. FINISHED The job or task is successfully finished. The Stop Time
shows the time when the job or task enters this phase. FAILED The job or task has failed. The Stop Time
shows the time when the job or task enters this phase. STOPPED The job or task is stopped without running. The Stop Time
shows the time when the job or task enters this phase. REMOVED The job or task is removed. The description of switching the status is described as follows.
Queue -- running -- finished -- removed\n \\ \\ /\n \\ \\ -- failed -- /\n \\ \\ /\n \\ ---------- stopped -/\n
"},{"location":"3.ngql-guide/4.job-statements/#show_jobs","title":"SHOW JOBS","text":"The SHOW JOBS
statement lists all the unexpired jobs in the current graph space.
The default job expiration interval is one week. You can change it by modifying the job_expired_secs
parameter of the Meta Service. For how to modify job_expired_secs
, see Meta Service configuration.
For example:
nebula> SHOW JOBS;\n+--------+---------------------+------------+----------------------------+----------------------------+\n| Job Id | Command | Status | Start Time | Stop Time |\n+--------+---------------------+------------+----------------------------+----------------------------+\n| 34 | \"STATS\" | \"FINISHED\" | 2021-11-01T03:32:27.000000 | 2021-11-01T03:32:27.000000 |\n| 33 | \"FLUSH\" | \"FINISHED\" | 2021-11-01T03:32:15.000000 | 2021-11-01T03:32:15.000000 |\n| 32 | \"COMPACT\" | \"FINISHED\" | 2021-11-01T03:32:06.000000 | 2021-11-01T03:32:06.000000 |\n| 31 | \"REBUILD_TAG_INDEX\" | \"FINISHED\" | 2021-10-29T05:39:16.000000 | 2021-10-29T05:39:17.000000 |\n| 10 | \"COMPACT\" | \"FINISHED\" | 2021-10-26T02:27:05.000000 | 2021-10-26T02:27:05.000000 |\n+--------+---------------------+------------+----------------------------+----------------------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#stop_job","title":"STOP JOB","text":"The STOP JOB <job_id>
statement stops jobs that are not finished in the current graph space.
For example:
nebula> STOP JOB 22;\n+---------------+\n| Result |\n+---------------+\n| \"Job stopped\" |\n+---------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#recover_job","title":"RECOVER JOB","text":"The RECOVER JOB [<job_id>]
statement re-executes the jobs that status is FAILED
or STOPPED
in the current graph space and returns the number of recovered jobs. If <job_id>
is not specified, re-execution is performed from the earliest job and the number of jobs that have been recovered is returned.
For example:
nebula> RECOVER JOB;\n+-------------------+\n| Recovered job num |\n+-------------------+\n| 5 job recovered |\n+-------------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#faq","title":"FAQ","text":""},{"location":"3.ngql-guide/4.job-statements/#how_to_troubleshoot_job_problems","title":"How to troubleshoot job problems?","text":"The SUBMIT JOB
operations use the HTTP port. Please check if the HTTP ports on the machines where the Storage Service is running are working well. You can use the following command to debug.
curl \"http://{storaged-ip}:19779/admin?space={space_name}&op=compact\"\n
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/","title":"NebulaGraph Query Language (nGQL)","text":"This topic gives an introduction to the query language of NebulaGraph, nGQL.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#what_is_ngql","title":"What is nGQL","text":"nGQL is a declarative graph query language for NebulaGraph. It allows expressive and efficient graph patterns. nGQL is designed for both developers and operations professionals. nGQL is an SQL-like query language, so it's easy to learn.
nGQL is a project in progress. New features and optimizations are done steadily. There can be differences between syntax and implementation. Submit an issue to inform the NebulaGraph team if you find a new issue of this type. NebulaGraph 3.0 or later releases will support openCypher 9.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#what_can_ngql_do","title":"What can nGQL do","text":"Users can download the example data Basketballplayer in NebulaGraph. After downloading the example data, you can import it to NebulaGraph by using the -f
option in NebulaGraph Console.
Note
Ensure that you have executed the ADD HOSTS
command to add the Storage service to your NebulaGraph cluster before importing the example data. For more information, see Manage Storage hosts.
Refer to the following standards in nGQL:
In template code, any token that is not a keyword, a literal value, or punctuation is a placeholder identifier or a placeholder value.
For details of the symbols in nGQL syntax, see the following table:
Token Meaning < > name of a syntactic element : formula that defines an element [ ] optional elements { } explicitly specified elements | complete alternative elements ... may be repeated any number of timesFor example, create vertices in nGQL syntax:
INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...]\nVALUES <vid>: ([prop_value_list])\ntag_props:\n tag_name ([prop_name_list])\nprop_name_list:\n [prop_name [, prop_name] ...]\nprop_value_list:\n [prop_value [, prop_value] ...] \n
Example statement:
nebula> CREATE TAG IF NOT EXISTS player(name string, age int);\n
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#about_opencypher_compatibility","title":"About openCypher compatibility","text":""},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#native_ngql_and_opencypher","title":"Native nGQL and openCypher","text":"Native nGQL is the part of a graph query language designed and implemented by NebulaGraph. OpenCypher is a graph query language maintained by openCypher Implementers Group.
The latest release is openCypher 9. The compatible parts of openCypher in nGQL are called openCypher compatible sentences (short as openCypher).
Note
nGQL
= native nGQL
+ openCypher compatible sentences
NO.
Compatibility with openCypher
nGQL is designed to be compatible with part of DQL (match, optional match, with, etc.).
Users can search in this manual with the keyword compatibility
to find major compatibility issues.
Multiple known incompatible items are listed in NebulaGraph Issues. Submit an issue with the incompatible
tag if you find a new issue of this type.
The following are some major differences (by design incompatible) between nGQL and openCypher.
Category openCypher 9 nGQL Schema Optional Schema Strong Schema Equality operator=
==
Math exponentiation ^
^
is not supported. Use pow(x, y) instead. Edge rank No such concept. edge rank (reference by @) Statement - All DMLs (CREATE
, MERGE
, etc) of openCypher 9. Label and tag A label is used for searching a vertex, namely an index of vertex. A tag defines the type of a vertex and its corresponding properties. It cannot be used as an index. Pre-compiling and parameterized queries Support Parameterized queries are supported, but precompiling is not. Compatibility
OpenCypher 9 and Cypher have some differences in grammar and licence. For example,
Cypher requires that All Cypher statements are explicitly run within a transaction. While openCypher has no such requirement. And nGQL does not support transactions.
Cypher has a variety of constraints, including Unique node property constraints, Node property existence constraints, Relationship property existence constraints, and Node key constraints. While OpenCypher has no such constraints. As a strong schema system, most of the constraints mentioned above can be solved through schema definitions (including NOT NULL) in nGQL. The only function that cannot be supported is the UNIQUE constraint.
Cypher has APoC, while openCypher 9 does not have APoC. Cypher has Blot protocol support requirements, while openCypher 9 does not.
Users can find more than 2500 nGQL examples in the features directory on the NebulaGraph GitHub page.
The features
directory consists of .feature
files. Each file records scenarios that you can use as nGQL examples. Here is an example:
Feature: Basic match\n\n Background:\n Given a graph with space named \"basketballplayer\"\n\n Scenario: Single node\n When executing query:\n \"\"\"\n MATCH (v:player {name: \"Yao Ming\"}) RETURN v;\n \"\"\"\n Then the result should be, in any order, with relax comparison:\n | v |\n | (\"player133\" :player{age: 38, name: \"Yao Ming\"}) |\n\n Scenario: One step\n When executing query:\n \"\"\"\n MATCH (v1:player{name: \"LeBron James\"}) -[r]-> (v2)\n RETURN type(r) AS Type, v2.player.name AS Name\n \"\"\"\n Then the result should be, in any order:\n\n | Type | Name |\n | \"follow\" | \"Ray Allen\" |\n | \"serve\" | \"Lakers\" |\n | \"serve\" | \"Heat\" |\n | \"serve\" | \"Cavaliers\" |\n\nFeature: Comparison of where clause\n\n Background:\n Given a graph with space named \"basketballplayer\"\n\n Scenario: push edge props filter down\n When profiling query:\n \"\"\"\n GO FROM \"player100\" OVER follow \n WHERE properties(edge).degree IN [v IN [95,99] WHERE v > 0] \n YIELD dst(edge), properties(edge).degree\n \"\"\"\n Then the result should be, in any order:\n | follow._dst | follow.degree |\n | \"player101\" | 95 |\n | \"player125\" | 95 |\n And the execution plan should be:\n | id | name | dependencies | operator info |\n | 0 | Project | 1 | |\n | 1 | GetNeighbors | 2 | {\"filter\": \"(properties(edge).degree IN [v IN [95,99] WHERE (v>0)])\"} |\n | 2 | Start | | |\n
The keywords in the preceding example are described as follows.
Keyword DescriptionFeature
Describes the topic of the current .feature
file. Background
Describes the background information of the current .feature
file. Given
Describes the prerequisites of running the test statements in the current .feature
file. Scenario
Describes the scenarios. If there is the @skip
before one Scenario
, this scenario may not work and do not use it as a working example in a production environment. When
Describes the nGQL statement to be executed. It can be a executing query
or profiling query
. Then
Describes the expected return results of running the statement in the When
clause. If the return results in your environment do not match the results described in the .feature
file, submit an issue to inform the NebulaGraph team. And
Describes the side effects of running the statement in the When
clause. @skip
This test case will be skipped. Commonly, the to-be-tested code is not ready. Welcome to add more tck case and return automatically to the using statements in CI/CD.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#does_it_support_tinkerpop_gremlin","title":"Does it support TinkerPop Gremlin?","text":"No. And no plan to support that.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#does_nebulagraph_support_w3c_rdf_sparql_or_graphql","title":"Does NebulaGraph support W3C RDF (SPARQL) or GraphQL?","text":"No. And no plan to support that.
The data model of NebulaGraph is the property graph. And as a strong schema system, NebulaGraph does not support RDF.
NebulaGraph Query Language does not support SPARQL
nor GraphQL
.
Patterns and graph pattern matching are the very heart of a graph query language. This topic will describe the patterns in NebulaGraph, some of which have not yet been implemented.
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_vertices","title":"Patterns for vertices","text":"A vertex is described using a pair of parentheses and is typically given a name. For example:
(a)\n
This simple pattern describes a single vertex and names that vertex using the variable a
.
A more powerful construct is a pattern that describes multiple vertices and edges between them. Patterns describe an edge by employing an arrow between two vertices. For example:
(a)-[]->(b)\n
This pattern describes a very simple data structure: two vertices and a single edge from one to the other. In this example, the two vertices are named as a
and b
respectively and the edge is directed
: it goes from a
to b
.
This manner of describing vertices and edges can be extended to cover an arbitrary number of vertices and the edges between them, for example:
(a)-[]->(b)<-[]-(c)\n
Such a series of connected vertices and edges is called a path
.
Note that the naming of the vertices in these patterns is only necessary when one needs to refer to the same vertex again, either later in the pattern or elsewhere in the query. If not, the name may be omitted as follows:
(a)-[]->()<-[]-(c)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_tags","title":"Patterns for tags","text":"Note
The concept of tag
in nGQL has a few differences from that of label
in openCypher. For example, users must create a tag
before using it. And a tag
also defines the type of properties.
In addition to simply describing the vertices in the graphs, patterns can also describe the tags of the vertices. For example:
(a:User)-[]->(b)\n
Patterns can also describe a vertex that has multiple tags. For example:
(a:User:Admin)-[]->(b)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_properties","title":"Patterns for properties","text":"Vertices and edges are the fundamental elements in a graph. In nGQL, properties are added to them for richer models.
In the patterns, the properties can be expressed as follows: some key-value pairs are enclosed in curly brackets and separated by commas, and the tag or edge type to which a property belongs must be specified.
For example, a vertex with two properties will be like:
(a:player{name: \"Tim Duncan\", age: 42})\n
One of the edges that connect to this vertex can be like:
(a)-[e:follow{degree: 95}]->(b)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_edges","title":"Patterns for edges","text":"The simplest way to describe an edge is by using the arrow between two vertices, as in the previous examples.
Users can describe an edge and its direction using the following statement. If users do not care about its direction, the arrowhead can be omitted. For example:
(a)-[]-(b)\n
Like vertices, edges can also be named. A pair of square brackets will be used to separate the arrow and the variable will be placed between them. For example:
(a)-[r]->(b)\n
Like the tags on vertices, edges can also have types. To describe an edge with a specific type, use the pattern as follows:
(a)-[r:REL_TYPE]->(b)\n
An edge can only have one edge type. But if we'd like to describe some data such that the edge could have a set of types, then they can all be listed in the pattern, separating them with the pipe symbol |
like this:
(a)-[r:TYPE1|TYPE2]->(b)\n
Like vertices, the name of an edge can be omitted. For example:
(a)-[:REL_TYPE]->(b)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#variable-length_pattern","title":"Variable-length pattern","text":"Rather than describing a long path using a sequence of many vertex and edge descriptions in a pattern, many edges (and the intermediate vertices) can be described by specifying a length in the edge description of a pattern. For example:
(a)-[*2]->(b)\n
The following pattern describes a graph of three vertices and two edges, all in one path (a path of length 2). It is equivalent to:
(a)-[]->()-[]->(b)\n
The range of lengths can also be specified. Such edge patterns are called variable-length edges
. For example:
(a)-[*3..5]->(b)\n
The preceding example defines a path with a minimum length of 3 and a maximum length of 5.
It describes a graph of either 4 vertices and 3 edges, 5 vertices and 4 edges, or 6 vertices and 5 edges, all connected in a single path.
You may specify either the upper limit or lower limit of the length range, or neither of them, for example:
(a)-[*..5]->(b) // The minimum length is 1 and the maximum length is 5.\n(a)-[*3..]->(b) // The minimum length is 3 and the maximum length is infinity.\n(a)-[*]->(b) // The minimum length is 1 and the maximum length is infinity.\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#assigning_to_path_variables","title":"Assigning to path variables","text":"As described above, a series of connected vertices and edges is called a path
. nGQL allows paths to be named using variables. For example:
p = (a)-[*3..5]->(b)\n
Users can do this in the MATCH
statement.
This topic will describe the comments in nGQL.
Legacy version compatibility
#
, --
, //
, /* */
.--
cannot be used as comments.nebula> RETURN 1+1; # This comment continues to the end of this line.\nnebula> RETURN 1+1; // This comment continues to the end of this line.\nnebula> RETURN 1 /* This is an in-line comment. */ + 1 == 2;\nnebula> RETURN 11 + \\\n/* Multi-line comment. \\\nUse a backslash as a line break. \\\n*/ 12;\n
Note
\\
in a line indicates a line break.#
or //
, the statement is not executed and the error StatementEmpty
is returned.\\
at the end of every line, even in multi-line comments /* */
.\\
as a line break./* openCypher style:\nThe following comment\nspans more than\none line */\nMATCH (n:label)\nRETURN n;\n
/* nGQL style: \\\nThe following comment \\\nspans more than \\\none line */ \\\nMATCH (n:tag) \\\nRETURN n;\n
"},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/","title":"Identifier case sensitivity","text":""},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/#identifiers_are_case-sensitive","title":"Identifiers are Case-Sensitive","text":"The following statements will not work because they refer to two different spaces, i.e. my_space
and MY_SPACE
.
nebula> CREATE SPACE IF NOT EXISTS my_space (vid_type=FIXED_STRING(30));\nnebula> use MY_SPACE;\n[ERROR (-1005)]: SpaceNotFound:\n
"},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/#keywords_and_reserved_words_are_case-insensitive","title":"Keywords and Reserved Words are Case-Insensitive","text":"The following statements are equivalent since show
and spaces
are keywords.
nebula> show spaces; \nnebula> SHOW SPACES;\nnebula> SHOW spaces;\nnebula> show SPACES;\n
"},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/#functions_are_case-insensitive","title":"Functions are Case-Insensitive","text":"Functions are case-insensitive. For example, count()
, COUNT()
, and couNT()
are equivalent.
nebula> WITH [NULL, 1, 1, 2, 2] As a \\\n UNWIND a AS b \\\n RETURN count(b), COUNT(*), couNT(DISTINCT b);\n+----------+----------+-------------------+\n| count(b) | COUNT(*) | couNT(distinct b) |\n+----------+----------+-------------------+\n| 4 | 5 | 2 |\n+----------+----------+-------------------+\n
"},{"location":"3.ngql-guide/1.nGQL-overview/keywords-and-reserved-words/","title":"Keywords","text":"Keywords in nGQL are words with particular meanings, such as CREATE
and TAG
in the CREATE TAG
statement. Keywords that require special processing to be used as identifiers are referred to as reserved keywords
, while the part of keywords that can be used directly as identifiers are called non-reserved keywords
.
It is not recommended to use keywords to identify schemas. If you must use keywords as identifiers, pay attention to the following restrictions:
To use non-reserved keywords as identifiers:
Note
Keywords are case-insensitive.
nebula> CREATE TAG TAG(name string);\n[ERROR (-1004)]: SyntaxError: syntax error near `TAG'\n\nnebula> CREATE TAG `TAG` (name string);\nExecution succeeded\n\nnebula> CREATE TAG SPACE(name string);\nExecution succeeded\n\nnebula> CREATE TAG \u4e2d\u6587(\u7b80\u4f53 string);\nExecution succeeded\n\nnebula> CREATE TAG `\uffe5%special characters&*+-*/` (`q~\uff01\uff08\uff09= wer` string);\nExecution succeeded\n
"},{"location":"3.ngql-guide/1.nGQL-overview/keywords-and-reserved-words/#reserved_keywords","title":"Reserved keywords","text":"ACROSS\nADD\nALTER\nAND\nAS\nASC\nASCENDING\nBALANCE\nBOOL\nBY\nCASE\nCHANGE\nCOMPACT\nCREATE\nDATE\nDATETIME\nDELETE\nDESC\nDESCENDING\nDESCRIBE\nDISTINCT\nDOUBLE\nDOWNLOAD\nDROP\nDURATION\nEDGE\nEDGES\nEXISTS\nEXPLAIN\nFALSE\nFETCH\nFIND\nFIXED_STRING\nFLOAT\nFLUSH\nFROM\nGEOGRAPHY\nGET\nGO\nGRANT\nIF\nIGNORE_EXISTED_INDEX\nIN\nINDEX\nINDEXES\nINGEST\nINSERT\nINT\nINT16\nINT32\nINT64\nINT8\nINTERSECT\nIS\nJOIN\nLEFT\nLIST\nLOOKUP\nMAP\nMATCH\nMINUS\nNO\nNOT\nNULL\nOF\nON\nOR\nORDER\nOVER\nOVERWRITE\nPATH\nPROP\nREBUILD\nRECOVER\nREMOVE\nRESTART\nRETURN\nREVERSELY\nREVOKE\nSET\nSHOW\nSTEP\nSTEPS\nSTOP\nSTRING\nSUBMIT\nTAG\nTAGS\nTIME\nTIMESTAMP\nTO\nTRUE\nUNION\nUNWIND\nUPDATE\nUPSERT\nUPTO\nUSE\nVERTEX\nVERTICES\nWHEN\nWHERE\nWITH\nXOR\nYIELD\n
"},{"location":"3.ngql-guide/1.nGQL-overview/keywords-and-reserved-words/#non-reserved_keywords","title":"Non-reserved keywords","text":"ACCOUNT\nADMIN\nAGENT\nALL\nALLSHORTESTPATHS\nANALYZER\nANY\nATOMIC_EDGE\nAUTO\nBASIC\nBIDIRECT\nBOTH\nCHARSET\nCLEAR\nCLIENTS\nCOLLATE\nCOLLATION\nCOMMENT\nCONFIGS\nCONTAINS\nDATA\nDBA\nDEFAULT\nDIVIDE\nDRAINER\nDRAINERS\nELASTICSEARCH\nELSE\nEND\nENDS\nES_QUERY\nFORCE\nFORMAT\nFULLTEXT\nGOD\nGRANTS\nGRAPH\nGROUP\nGROUPS\nGUEST\nHDFS\nHOST\nHOSTS\nHTTP\nHTTPS\nINTO\nIP\nJOB\nJOBS\nKILL\nLEADER\nLIMIT\nLINESTRING\nLISTENER\nLOCAL\nMERGE\nMETA\nNEW\nNOLOOP\nNONE\nOFFSET\nOPTIONAL\nOUT\nPART\nPARTITION_NUM\nPARTS\nPASSWORD\nPLAN\nPOINT\nPOLYGON\nPROFILE\nQUERIES\nQUERY\nREAD\nREDUCE\nRENAME\nREPLICA_FACTOR\nRESET\nROLE\nROLES\nS2_MAX_CELLS\nS2_MAX_LEVEL\nSAMPLE\nSEARCH\nSERVICE\nSESSION\nSESSIONS\nSHORTEST\nSHORTESTPATH\nSIGN\nSINGLE\nSKIP\nSNAPSHOT\nSNAPSHOTS\nSPACE\nSPACES\nSTARTS\nSTATS\nSTATUS\nSTORAGE\nSUBGRAPH\nSYNC\nTEXT\nTEXT_SEARCH\nTHEN\nTOP\nTTL_COL\nTTL_DURATION\nUSER\nUSERS\nUUID\nVALUE\nVALUES\nVARIABLES\nVID_TYPE\nWHITELIST\nWRITE\nZONE\nZONES\n
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/","title":"nGQL style guide","text":"nGQL does not have strict formatting requirements, but creating nGQL statements according to an appropriate and uniform style can improve readability and avoid ambiguity. Using the same nGQL style in the same organization or project helps reduce maintenance costs and avoid problems caused by format confusion or misunderstanding. This topic will provide a style guide for writing nGQL statements.
Compatibility
The styles of nGQL and Cypher Style Guide are different.
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/#newline","title":"Newline","text":"Start a new line to write a clause.
Not recommended:
GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS id;\n
Recommended:
GO FROM \"player100\" \\\nOVER follow REVERSELY \\\nYIELD src(edge) AS id;\n
Start a new line to write different statements in a composite statement.
Not recommended:
GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS id | GO FROM $-.id \\\nOVER serve WHERE properties($^).age > 20 YIELD properties($^).name AS FriendOf, properties($$).name AS Team;\n
Recommended:
GO FROM \"player100\" \\\nOVER follow REVERSELY \\\nYIELD src(edge) AS id | \\\nGO FROM $-.id OVER serve \\\nWHERE properties($^).age > 20 \\\nYIELD properties($^).name AS FriendOf, properties($$).name AS Team;\n
If the clause exceeds 80 characters, start a new line at the appropriate place.
Not recommended:
MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\nWHERE (v2.player.name STARTS WITH \"Y\" AND v2.player.age > 35 AND v2.player.age < v.player.age) OR (v2.player.name STARTS WITH \"T\" AND v2.player.age < 45 AND v2.player.age > v.player.age) \\\nRETURN v2;\n
Recommended:
MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\nWHERE (v2.player.name STARTS WITH \"Y\" AND v2.player.age > 35 AND v2.player.age < v.player.age) \\\nOR (v2.player.name STARTS WITH \"T\" AND v2.player.age < 45 AND v2.player.age > v.player.age) \\\nRETURN v2;\n
Note
If needed, you can also start a new line for better understanding, even if the clause does not exceed 80 characters.
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/#identifier_naming","title":"Identifier naming","text":"In nGQL statements, characters other than keywords, punctuation marks, and blanks are all identifiers. Recommended methods to name the identifiers are as follows.
Use singular nouns to name tags, and use the base form of verbs or verb phrases to form Edge types.
Not recommended:
MATCH p=(v:players)-[e:are_following]-(v2) \\\nRETURN nodes(p);\n
Recommended:
MATCH p=(v:player)-[e:follow]-(v2) \\\nRETURN nodes(p);\n
Use the snake case to name identifiers, and connect words with underscores (_) with all the letters lowercase.
Not recommended:
MATCH (v:basketballTeam) \\\nRETURN v;\n
Recommended:
MATCH (v:basketball_team) \\\nRETURN v;\n
Use uppercase keywords and lowercase variables.
Not recommended:
match (V:player) return V limit 5;\n
Recommended:
MATCH (v:player) RETURN v LIMIT 5;\n
Start a new line on the right side of the arrow indicating an edge when writing patterns.
Not recommended:
MATCH (v:player{name: \"Tim Duncan\", age: 42}) \\\n-[e:follow]->()-[e:serve]->()<--(v2) \\\nRETURN v, e, v2;\n
Recommended:
MATCH (v:player{name: \"Tim Duncan\", age: 42})-[e:follow]-> \\\n()-[e:serve]->()<--(v2) \\\nRETURN v, e, v2;\n
Anonymize the vertices and edges that do not need to be queried.
Not recommended:
MATCH (v:player)-[e:follow]->(v2) \\\nRETURN v;\n
Recommended:
MATCH (v:player)-[:follow]->() \\\nRETURN v;\n
Place named vertices in front of anonymous vertices.
Not recommended:
MATCH ()-[:follow]->(v) \\\nRETURN v;\n
Recommended:
MATCH (v)<-[:follow]-() \\\nRETURN v;\n
The strings should be surrounded by double quotes.
Not recommended:
RETURN 'Hello Nebula!';\n
Recommended:
RETURN \"Hello Nebula!\\\"123\\\"\";\n
Note
When single or double quotes need to be nested in a string, use a backslash () to escape. For example:
RETURN \"\\\"NebulaGraph is amazing,\\\" the user says.\";\n
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/#statement_termination","title":"Statement termination","text":"End the nGQL statements with an English semicolon (;).
Not recommended:
FETCH PROP ON player \"player100\" YIELD properties(vertex)\n
Recommended:
FETCH PROP ON player \"player100\" YIELD properties(vertex);\n
Use a pipe (|) to separate a composite statement, and end the statement with an English semicolon at the end of the last line. Using an English semicolon before a pipe will cause the statement to fail.
Not supported:
GO FROM \"player100\" \\\nOVER follow \\\nYIELD dst(edge) AS id; | \\\nGO FROM $-.id \\\nOVER serve \\\nYIELD properties($$).name AS Team, properties($^).name AS Player;\n
Supported:
GO FROM \"player100\" \\\nOVER follow \\\nYIELD dst(edge) AS id | \\\nGO FROM $-.id \\\nOVER serve \\\nYIELD properties($$).name AS Team, properties($^).name AS Player;\n
In a composite statement that contains user-defined variables, use an English semicolon to end the statements that define the variables. If you do not follow the rules to add a semicolon or use a pipe to end the composite statement, the execution will fail.
Not supported:
$var = GO FROM \"player100\" \\\nOVER follow \\\nYIELD follow._dst AS id \\\nGO FROM $var.id \\\nOVER serve \\\nYIELD $$.team.name AS Team, $^.player.name AS Player;\n
Not supported:
$var = GO FROM \"player100\" \\\nOVER follow \\\nYIELD follow._dst AS id | \\\nGO FROM $var.id \\\nOVER serve \\\nYIELD $$.team.name AS Team, $^.player.name AS Player;\n
Supported:
$var = GO FROM \"player100\" \\\nOVER follow \\\nYIELD follow._dst AS id; \\\nGO FROM $var.id \\\nOVER serve \\\nYIELD $$.team.name AS Team, $^.player.name AS Player;\n
CREATE TAG
creates a tag with the given name in a graph space.
Tags in nGQL are similar to labels in openCypher. But they are also quite different. For example, the ways to create them are different.
CREATE
statements.CREATE TAG
statements. Tags in nGQL are more like tables in MySQL.Running the CREATE TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
To create a tag in a specific graph space, you must specify the current working space with the USE
statement.
CREATE TAG [IF NOT EXISTS] <tag_name>\n (\n <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']\n [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] \n )\n [TTL_DURATION = <ttl_duration>]\n [TTL_COL = <prop_name>]\n [COMMENT = '<comment>'];\n
Parameter Description IF NOT EXISTS
Detects if the tag that you want to create exists. If it does not exist, a new one will be created. The tag existence detection here only compares the tag names (excluding properties). <tag_name>
1. Each tag name in the graph space must be unique. 2. Tag names cannot be modified after they are set. 3. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number. 4. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words. Note:1. If you name a tag in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in a tag name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <prop_name>
The name of the property. It must be unique for each tag. The rules for permitted property names are the same as those for tag names. <data_type>
Shows the data type of each property. For a full description of the property data types, see Data types and Boolean. NULL | NOT NULL
Specifies if the property supports NULL | NOT NULL
. The default value is NULL
. DEFAULT
Specifies a default value for a property. The default value can be a literal value or an expression supported by NebulaGraph. If no value is specified, the default value is used when inserting a new vertex. COMMENT
The remarks of a certain property or the tag itself. The maximum length is 256 bytes. By default, there will be no comments on a tag. TTL_DURATION
Specifies the life cycle for the property. The property that exceeds the specified TTL expires. The expiration threshold is the TTL_COL
value plus the TTL_DURATION
. The default value of TTL_DURATION
is 0
. It means the data never expires. TTL_COL
Specifies the property to set a timeout on. The data type of the property must be int
or timestamp
. A tag can only specify one field as TTL_COL
. For more information on TTL, see TTL options."},{"location":"3.ngql-guide/10.tag-statements/1.create-tag/#examples","title":"Examples","text":"nebula> CREATE TAG IF NOT EXISTS player(name string, age int);\n\n# The following example creates a tag with no properties.\nnebula> CREATE TAG IF NOT EXISTS no_property();\n\n# The following example creates a tag with a default value.\nnebula> CREATE TAG IF NOT EXISTS player_with_default(name string, age int DEFAULT 20);\n\n# In the following example, the TTL of the create_time field is set to be 100 seconds.\nnebula> CREATE TAG IF NOT EXISTS woman(name string, age int, \\\n        married bool, salary double, create_time timestamp) \\\n        TTL_DURATION = 100, TTL_COL = \"create_time\";\n
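As a worked example of the TTL rule above: for the woman tag, a vertex written with create_time = 1648000000 expires once the current timestamp exceeds 1648000000 + 100 = 1648000100, that is, the TTL_COL value plus the TTL_DURATION.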
"},{"location":"3.ngql-guide/10.tag-statements/1.create-tag/#implementation_of_the_operation","title":"Implementation of the operation","text":"Trying to use a newly created tag may fail because the creation of the tag is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
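For example, a hedged sketch of the configuration entry (10 seconds is assumed to be the default interval, which yields the 20-second wait above):
--heartbeat_interval_secs=10\n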
DROP TAG
drops a tag with the given name in the current working graph space.
A vertex can have one or more tags.
This operation only deletes the Schema data. The files and directories on the disk are not deleted until the next compaction.
Compatibility
In NebulaGraph master, inserting vertex without tag is not supported by default. If you want to use the vertex without tags, add --graph_use_vertex_key=true
to the configuration files (nebula-graphd.conf
) of all Graph services in the cluster, and add --use_vertex_key=true
to the configuration files (nebula-storaged.conf
) of all Storage services in the cluster.
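A sketch of the corresponding configuration entries (gflags style, one per file):
# nebula-graphd.conf\n--graph_use_vertex_key=true\n\n# nebula-storaged.conf\n--use_vertex_key=true\n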
DROP TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.If there is an index on the tag, the conflict error ([ERROR (-1005)]: Conflict!
) will be returned when you run the DROP TAG
statement. To drop an index, see DROP INDEX.DROP TAG [IF EXISTS] <tag_name>;\n
IF EXISTS
: Detects if the tag that you want to drop exists. Only when it exists will it be dropped.tag_name
: Specifies the tag name that you want to drop. You can drop only one tag in one statement.nebula> CREATE TAG IF NOT EXISTS test(p1 string, p2 int);\nnebula> DROP TAG test;\n
"},{"location":"3.ngql-guide/10.tag-statements/3.alter-tag/","title":"ALTER TAG","text":"ALTER TAG
alters the structure of a tag with the given name in a graph space. You can add or drop properties, and change the data type of an existing property. You can also set a TTL (Time-To-Live) on a property, or change its TTL duration.
ALTER TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.If there is an index on the tag, the conflict error [ERROR (-1005)]: Conflict!
will occur when you ALTER TAG
. For more information on dropping an index, see DROP INDEX.ALTER TAG <tag_name>\n <alter_definition> [[, alter_definition] ...]\n [ttl_definition [, ttl_definition] ... ]\n [COMMENT '<comment>'];\n\nalter_definition:\n| ADD (prop_name data_type [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'])\n| DROP (prop_name)\n| CHANGE (prop_name data_type [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'])\n\nttl_definition:\n TTL_DURATION = ttl_duration, TTL_COL = prop_name\n
tag_name
: Specifies the tag name that you want to alter. You can alter only one tag in one statement. Before you alter a tag, make sure that the tag exists in the current working graph space. If the tag does not exist, an error will occur when you alter it.ADD
, DROP
, and CHANGE
clauses are permitted in a single ALTER TAG
statement, separated by commas.NOT NULL
using ADD
or CHANGE
, a default value must be specified for the property, that is, the value of DEFAULT
must be specified.When using CHANGE
to modify the data type of a property:
FIXED_STRING
or an INT
can be increased. The length of a FIXED_STRING
or an INT
cannot be decreased.nebula> CREATE TAG IF NOT EXISTS t1 (p1 string, p2 int);\nnebula> ALTER TAG t1 ADD (p3 int32, fixed_string(10));\nnebula> ALTER TAG t1 TTL_DURATION = 2, TTL_COL = \"p2\";\nnebula> ALTER TAG t1 COMMENT = 'test1';\nnebula> ALTER TAG t1 ADD (p5 double NOT NULL DEFAULT 0.4 COMMENT 'p5') COMMENT='test2';\n// Change the data type of p3 in the TAG t1 from INT32 to INT64, and that of p4 from FIXED_STRING(10) to STRING.\nnebula> ALTER TAG t1 CHANGE (p3 int64, p4 string);\n[ERROR(-1005)]: Unsupported!\n
"},{"location":"3.ngql-guide/10.tag-statements/3.alter-tag/#implementation_of_the_operation","title":"Implementation of the operation","text":"Trying to use a newly altered tag may fail because the alteration of the tag is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
The SHOW TAGS
statement shows the name of all tags in the current graph space.
You do not need any privileges for the graph space to run the SHOW TAGS
statement. But the returned results are different based on role privileges.
SHOW TAGS;\n
"},{"location":"3.ngql-guide/10.tag-statements/4.show-tags/#examples","title":"Examples","text":"nebula> SHOW TAGS;\n+----------+\n| Name |\n+----------+\n| \"player\" |\n| \"team\" |\n+----------+\n
"},{"location":"3.ngql-guide/10.tag-statements/5.describe-tag/","title":"DESCRIBE TAG","text":"DESCRIBE TAG
returns the information about a tag with the given name in a graph space, such as field names, data type, and so on.
Running the DESCRIBE TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
DESC[RIBE] TAG <tag_name>;\n
You can use DESC
instead of DESCRIBE
for short.
nebula> DESCRIBE TAG player;\n+--------+----------+-------+---------+---------+\n| Field | Type | Null | Default | Comment |\n+--------+----------+-------+---------+---------+\n| \"name\" | \"string\" | \"YES\" | | |\n| \"age\" | \"int64\" | \"YES\" | | |\n+--------+----------+-------+---------+---------+\n
"},{"location":"3.ngql-guide/10.tag-statements/6.delete-tag/","title":"DELETE TAG","text":"DELETE TAG
deletes a tag with the given name on a specified vertex.
Running the DELETE TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
DELETE TAG <tag_name_list> FROM <VID_list>;\n
tag_name_list
: The names of the tags you want to delete. Multiple tags are separated with commas (,). *
means all tags.VID
: The VIDs of the vertices from which you want to delete the tags. Multiple VIDs are separated with commas (,). nebula> CREATE TAG IF NOT EXISTS test1(p1 string, p2 int);\nnebula> CREATE TAG IF NOT EXISTS test2(p3 string, p4 int);\nnebula> INSERT VERTEX test1(p1, p2),test2(p3, p4) VALUES \"test\":(\"123\", 1, \"456\", 2);\nnebula> FETCH PROP ON * \"test\" YIELD vertex AS v;\n+------------------------------------------------------------+\n| v |\n+------------------------------------------------------------+\n| (\"test\" :test1{p1: \"123\", p2: 1} :test2{p3: \"456\", p4: 2}) |\n+------------------------------------------------------------+\nnebula> DELETE TAG test1 FROM \"test\";\nnebula> FETCH PROP ON * \"test\" YIELD vertex AS v;\n+-----------------------------------+\n| v |\n+-----------------------------------+\n| (\"test\" :test2{p3: \"456\", p4: 2}) |\n+-----------------------------------+\nnebula> DELETE TAG * FROM \"test\";\nnebula> FETCH PROP ON * \"test\" YIELD vertex AS v;\n+---+\n| v |\n+---+\n+---+\n
Compatibility
REMOVE v:LABEL
to delete the tag LABEL
of the vertex v
.DELETE TAG
and DROP TAG
have the same semantics but different syntax. In nGQL, use DELETE TAG
.OpenCypher has the features of SET label
and REMOVE label
to speed up the process of querying or labeling.
NebulaGraph achieves the same operations by creating and inserting tags to an existing vertex, which can quickly query vertices based on the tag name. Users can also run DELETE TAG
to delete some vertices that are no longer needed.
For example, in the basketballplayer
data set, some basketball players are also team shareholders. Users can create an index for the shareholder tag shareholder
for quick search. If the player is no longer a shareholder, users can delete the shareholder tag of the corresponding player by DELETE TAG
.
//This example creates the shareholder tag and index.\nnebula> CREATE TAG IF NOT EXISTS shareholder();\nnebula> CREATE TAG INDEX IF NOT EXISTS shareholder_tag on shareholder();\n\n//This example adds a tag on the vertex.\nnebula> INSERT VERTEX shareholder() VALUES \"player100\":();\nnebula> INSERT VERTEX shareholder() VALUES \"player101\":();\n\n//This example queries all the shareholders.\nnebula> MATCH (v:shareholder) RETURN v;\n+--------------------------------------------------------------------+\n| v |\n+--------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :shareholder{}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"} :shareholder{}) |\n+--------------------------------------------------------------------+\n\nnebula> LOOKUP ON shareholder YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n+-------------+\n\n//In this example, the \"player100\" is no longer a shareholder.\nnebula> DELETE TAG shareholder FROM \"player100\";\nnebula> LOOKUP ON shareholder YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player101\" |\n+-------------+\n
Note
If the index is created after inserting the test data, use the REBUILD TAG INDEX <index_name_list>;
statement to rebuild the index.
CREATE EDGE
creates an edge type with the given name in a graph space.
Edge types in nGQL are similar to relationship types in openCypher. But they are also quite different. For example, the ways to create them are different.
CREATE
statements.CREATE EDGE
statements. Edge types in nGQL are more like tables in MySQL.Running the CREATE EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
To create an edge type in a specific graph space, you must specify the current working space with the USE
statement.
CREATE EDGE [IF NOT EXISTS] <edge_type_name>\n (\n <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']\n [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] \n )\n [TTL_DURATION = <ttl_duration>]\n [TTL_COL = <prop_name>]\n [COMMENT = '<comment>'];\n
Parameter Description IF NOT EXISTS
Detects if the edge type that you want to create exists. If it does not exist, a new one will be created. The edge type existence detection here only compares the edge type names (excluding properties). <edge_type_name>
1. The edge type name must be unique in a graph space. 2. Once the edge type name is set, it can not be altered. 3. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number. 4. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words. Note:1. If you name an edge type in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in an edge type name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <prop_name>
The name of the property. It must be unique for each edge type. The rules for permitted property names are the same as those for edge type names. <data_type>
Shows the data type of each property. For a full description of the property data types, see Data types and Boolean. NULL | NOT NULL
Specifies if the property supports NULL | NOT NULL
. The default value is NULL
. DEFAULT
must be specified if NOT NULL
is set. DEFAULT
Specifies a default value for a property. The default value can be a literal value or an expression supported by NebulaGraph. If no value is specified, the default value is used when inserting a new edge. COMMENT
The remarks of a certain property or the edge type itself. The maximum length is 256 bytes. By default, there will be no comments on an edge type. TTL_DURATION
Specifies the life cycle for the property. The property that exceeds the specified TTL expires. The expiration threshold is the TTL_COL
value plus the TTL_DURATION
. The default value of TTL_DURATION
is 0
. It means the data never expires. TTL_COL
Specifies the property to set a timeout on. The data type of the property must be int
or timestamp
. An edge type can only specify one field as TTL_COL
. For more information on TTL, see TTL options."},{"location":"3.ngql-guide/11.edge-type-statements/1.create-edge/#examples","title":"Examples","text":"nebula> CREATE EDGE IF NOT EXISTS follow(degree int);\n\n# The following example creates an edge type with no properties.\nnebula> CREATE EDGE IF NOT EXISTS no_property();\n\n# The following example creates an edge type with a default value.\nnebula> CREATE EDGE IF NOT EXISTS follow_with_default(degree int DEFAULT 20);\n\n# In the following example, the TTL of the p2 field is set to be 100 seconds.\nnebula> CREATE EDGE IF NOT EXISTS e1(p1 string, p2 int, p3 timestamp) \\\n TTL_DURATION = 100, TTL_COL = \"p2\";\n
"},{"location":"3.ngql-guide/11.edge-type-statements/2.drop-edge/","title":"DROP EDGE","text":"DROP EDGE
drops an edge type with the given name in a graph space.
An edge can have only one edge type. After you drop it, the edge CANNOT be accessed. The edge will be deleted in the next compaction.
This operation only deletes the Schema data. All the files or directories in the disk will not be deleted directly until the next compaction.
"},{"location":"3.ngql-guide/11.edge-type-statements/2.drop-edge/#prerequisites","title":"Prerequisites","text":"DROP EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.[ERROR (-1005)]: Conflict!
) will be returned. To drop an index, see DROP INDEX.DROP EDGE [IF EXISTS] <edge_type_name>\n
IF NOT EXISTS
: Detects if the edge type that you want to drop exists. Only when it exists will it be dropped.edge_type_name
: Specifies the edge type name that you want to drop. You can drop only one edge type in one statement.nebula> CREATE EDGE IF NOT EXISTS e1(p1 string, p2 int);\nnebula> DROP EDGE e1;\n
"},{"location":"3.ngql-guide/11.edge-type-statements/3.alter-edge/","title":"ALTER EDGE","text":"ALTER EDGE
alters the structure of an edge type with the given name in a graph space. You can add or drop properties, and change the data type of an existing property. You can also set a TTL (Time-To-Live) on a property, or change its TTL duration.
ALTER EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.[ERROR (-1005)]: Conflict!
will occur when you ALTER EDGE
. For more information on dropping an index, see DROP INDEX.FIXED_STRING
or an INT
can be increased.ALTER EDGE <edge_type_name>\n <alter_definition> [, alter_definition] ...]\n [ttl_definition [, ttl_definition] ... ]\n [COMMENT = '<comment>'];\n\nalter_definition:\n| ADD (prop_name data_type)\n| DROP (prop_name)\n| CHANGE (prop_name data_type)\n\nttl_definition:\n TTL_DURATION = ttl_duration, TTL_COL = prop_name\n
edge_type_name
: Specifies the edge type name that you want to alter. You can alter only one edge type in one statement. Before you alter an edge type, make sure that the edge type exists in the graph space. If the edge type does not exist, an error occurs when you alter it.ADD
, DROP
, and CHANGE
clauses are permitted in a single ALTER EDGE
statement, separated by commas.NOT NULL
using ADD
or CHANGE
, a default value must be specified for the property, that is, the value of DEFAULT
must be specified.nebula> CREATE EDGE IF NOT EXISTS e1(p1 string, p2 int);\nnebula> ALTER EDGE e1 ADD (p3 int, p4 string);\nnebula> ALTER EDGE e1 TTL_DURATION = 2, TTL_COL = \"p2\";\nnebula> ALTER EDGE e1 COMMENT = 'edge1';\n
"},{"location":"3.ngql-guide/11.edge-type-statements/3.alter-edge/#implementation_of_the_operation","title":"Implementation of the operation","text":"Trying to use a newly altered edge type may fail because the alteration of the edge type is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
SHOW EDGES
shows all edge types in the current graph space.
You do not need any privileges for the graph space to run the SHOW EDGES
statement. But the returned results are different based on role privileges.
SHOW EDGES;\n
"},{"location":"3.ngql-guide/11.edge-type-statements/4.show-edges/#example","title":"Example","text":"nebula> SHOW EDGES;\n+----------+\n| Name |\n+----------+\n| \"follow\" |\n| \"serve\" |\n+----------+\n
"},{"location":"3.ngql-guide/11.edge-type-statements/5.describe-edge/","title":"DESCRIBE EDGE","text":"DESCRIBE EDGE
returns the information about an edge type with the given name in a graph space, such as field names, data type, and so on.
Running the DESCRIBE EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
DESC[RIBE] EDGE <edge_type_name>\n
You can use DESC
instead of DESCRIBE
for short.
nebula> DESCRIBE EDGE follow;\n+----------+---------+-------+---------+---------+\n| Field | Type | Null | Default | Comment |\n+----------+---------+-------+---------+---------+\n| \"degree\" | \"int64\" | \"YES\" | | |\n+----------+---------+-------+---------+---------+\n
"},{"location":"3.ngql-guide/12.vertex-statements/1.insert-vertex/","title":"INSERT VERTEX","text":"The INSERT VERTEX
statement inserts one or more vertices into a graph space in NebulaGraph.
Running the INSERT VERTEX
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...]\nVALUES VID: ([prop_value_list])\n\ntag_props:\n tag_name ([prop_name_list])\n\nprop_name_list:\n [prop_name [, prop_name] ...]\n\nprop_value_list:\n [prop_value [, prop_value] ...] \n
IF NOT EXISTS
detects if the VID that you want to insert exists. If it does not exist, a new one will be inserted.
Note
IF NOT EXISTS
only compares the names of the VID and the tag (excluding properties).IF NOT EXISTS
will read to check whether the data exists, which will have a significant impact on performance.tag_name
denotes the tag (vertex type), which must be created before INSERT VERTEX
. For more information, see CREATE TAG.
Caution
NebulaGraph master supports inserting vertices without tags.
Compatibility
In NebulaGraph master, inserting vertex without tag is not supported by default. If you want to use the vertex without tags, add --graph_use_vertex_key=true
to the configuration files (nebula-graphd.conf
) of all Graph services in the cluster, add --use_vertex_key=true
to the configuration files (nebula-storaged.conf
) of all Storage services in the cluster. An example of a command to insert a vertex without tag is INSERT VERTEX VALUES \"1\":();
.
prop_name_list
contains the names of the properties on the tag.VID
is the vertex ID. In NebulaGraph 2.0, string and integer VID types are supported. The VID type is set when a graph space is created. For more information, see CREATE SPACE.prop_value_list
must provide the property values according to the prop_name_list
. When the NOT NULL
constraint is set for a given property, an error is returned if no property is given. When the default value for a property is NULL
, you can omit to specify the property value. For details, see CREATE TAG.Caution
INSERT VERTEX
and CREATE
have different semantics.
INSERT VERTEX
is closer to that of INSERT in NoSQL (key-value), or UPSERT
(UPDATE
or INSERT
) in SQL.IF NOT EXISTS
) with the same VID
and TAG
are operated at the same time, the latter INSERT will overwrite the former.VID
but different TAGS
are operated at the same time, the operation of different tags will not overwrite each other.Examples are as follows.
"},{"location":"3.ngql-guide/12.vertex-statements/1.insert-vertex/#examples","title":"Examples","text":"# Insert a vertex without tag.\nnebula> INSERT VERTEX VALUES \"1\":();\n\n# The following examples create tag t1 with no property and inserts vertex \"10\" with no property.\nnebula> CREATE TAG IF NOT EXISTS t1(); \nnebula> INSERT VERTEX t1() VALUES \"10\":(); \n
nebula> CREATE TAG IF NOT EXISTS t2 (name string, age int); \nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n1\", 12);\n\n# In the following example, the insertion fails because \"a13\" is not int.\nnebula> INSERT VERTEX t2 (name, age) VALUES \"12\":(\"n1\", \"a13\"); \n\n# The following example inserts two vertices at one time.\nnebula> INSERT VERTEX t2 (name, age) VALUES \"13\":(\"n3\", 12), \"14\":(\"n4\", 8); \n
nebula> CREATE TAG IF NOT EXISTS t3(p1 int);\nnebula> CREATE TAG IF NOT EXISTS t4(p2 string);\n\n# The following example inserts vertex \"21\" with two tags.\nnebula> INSERT VERTEX t3 (p1), t4(p2) VALUES \"21\": (321, \"hello\");\n
A vertex can be inserted/written with new values multiple times. Only the last written values can be read.
# The following examples insert vertex \"11\" with new values for multiple times.\nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n2\", 13);\nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n3\", 14);\nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n4\", 15);\nnebula> FETCH PROP ON t2 \"11\" YIELD properties(vertex);\n+-----------------------+\n| properties(VERTEX) |\n+-----------------------+\n| {age: 15, name: \"n4\"} |\n+-----------------------+\n
nebula> CREATE TAG IF NOT EXISTS t5(p1 fixed_string(5) NOT NULL, p2 int, p3 int DEFAULT NULL);\nnebula> INSERT VERTEX t5(p1, p2, p3) VALUES \"001\":(\"Abe\", 2, 3);\n\n# In the following example, the insertion fails because the value of p1 cannot be NULL.\nnebula> INSERT VERTEX t5(p1, p2, p3) VALUES \"002\":(NULL, 4, 5);\n[ERROR (-1009)]: SemanticError: No schema found for `t5'\n\n# In the following example, the value of p3 is the default NULL.\nnebula> INSERT VERTEX t5(p1, p2) VALUES \"003\":(\"cd\", 5);\nnebula> FETCH PROP ON t5 \"003\" YIELD properties(vertex);\n+---------------------------------+\n| properties(VERTEX) |\n+---------------------------------+\n| {p1: \"cd\", p2: 5, p3: __NULL__} |\n+---------------------------------+\n\n# In the following example, the allowed maximum length of p1 is 5.\nnebula> INSERT VERTEX t5(p1, p2) VALUES \"004\":(\"shalalalala\", 4);\nnebula> FETCH PROP on t5 \"004\" YIELD properties(vertex);\n+------------------------------------+\n| properties(VERTEX) |\n+------------------------------------+\n| {p1: \"shala\", p2: 4, p3: __NULL__} |\n+------------------------------------+\n
If you insert a vertex that already exists with IF NOT EXISTS
, there will be no modification.
# The following example inserts vertex \"1\".\nnebula> INSERT VERTEX t2 (name, age) VALUES \"1\":(\"n2\", 13);\n# Modify vertex \"1\" with IF NOT EXISTS. But there will be no modification as vertex \"1\" already exists.\nnebula> INSERT VERTEX IF NOT EXISTS t2 (name, age) VALUES \"1\":(\"n3\", 14);\nnebula> FETCH PROP ON t2 \"1\" YIELD properties(vertex);\n+-----------------------+\n| properties(VERTEX) |\n+-----------------------+\n| {age: 13, name: \"n2\"} |\n+-----------------------+\n
"},{"location":"3.ngql-guide/12.vertex-statements/2.update-vertex/","title":"UPDATE VERTEX","text":"The UPDATE VERTEX
statement updates properties on tags of a vertex.
In NebulaGraph, UPDATE VERTEX
supports compare-and-set (CAS).
Note
An UPDATE VERTEX
statement can only update properties on ONE TAG of a vertex.
UPDATE VERTEX ON <tag_name> <vid>\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <output>]\n
Parameter Required Description Example ON <tag_name>
Yes Specifies the tag of the vertex. The properties to be updated must be on this tag. ON player
<vid>
Yes Specifies the ID of the vertex to be updated. \"player100\"
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET age = age +1
WHEN <condition>
No Specifies the filter conditions. If <condition>
evaluates to false
, the SET
clause will not take effect. WHEN name == \"Tim\"
YIELD <output>
No Specifies the output format of the statement. YIELD name AS Name
"},{"location":"3.ngql-guide/12.vertex-statements/2.update-vertex/#example","title":"Example","text":"// This query checks the properties of vertex \"player101\".\nnebula> FETCH PROP ON player \"player101\" YIELD properties(vertex);\n+--------------------------------+\n| properties(VERTEX) |\n+--------------------------------+\n| {age: 36, name: \"Tony Parker\"} |\n+--------------------------------+\n\n// This query updates the age property and returns name and the new age.\nnebula> UPDATE VERTEX ON player \"player101\" \\\n SET age = age + 2 \\\n WHEN name == \"Tony Parker\" \\\n YIELD name AS Name, age AS Age;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 38 |\n+---------------+-----+\n
"},{"location":"3.ngql-guide/12.vertex-statements/3.upsert-vertex/","title":"UPSERT VERTEX","text":"The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT VERTEX
to update the properties of a vertex if it exists or insert a new vertex if it does not exist.
Note
An UPSERT VERTEX
statement can only update the properties on ONE TAG of a vertex.
The performance of UPSERT
is much lower than that of INSERT
because UPSERT
is a read-modify-write serialization operation at the partition level.
Danger
Don't use UPSERT
for scenarios with highly concurrent writes. You can use UPDATE
or INSERT
instead.
UPSERT VERTEX ON <tag> <vid>\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <output>]\n
Parameter Required Description Example ON <tag>
Yes Specifies the tag of the vertex. The properties to be updated must be on this tag. ON player
<vid>
Yes Specifies the ID of the vertex to be updated or inserted. \"player100\"
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET age = age +1
WHEN <condition>
No Specifies the filter conditions. WHEN name == \"Tim\"
YIELD <output>
No Specifies the output format of the statement. YIELD name AS Name
"},{"location":"3.ngql-guide/12.vertex-statements/3.upsert-vertex/#insert_a_vertex_if_it_does_not_exist","title":"Insert a vertex if it does not exist","text":"If a vertex does not exist, it is created no matter the conditions in the WHEN
clause are met or not, and the SET
clause always takes effect. The property values of the new vertex depend on:
SET
clause is defined.For example, if:
name
and age
based on the tag player
.SET
clause specifies that age = 30
.Then the property values in different cases are listed as follows:
AreWHEN
conditions met If properties have default values Value of name
Value of age
Yes Yes The default value 30
Yes No NULL
30
No Yes The default value 30
No No NULL
30
Here are some examples:
// This query checks if the following three vertices exist. The result \"Empty set\" indicates that the vertices do not exist.\nnebula> FETCH PROP ON * \"player666\", \"player667\", \"player668\" YIELD properties(vertex);\n+--------------------+\n| properties(VERTEX) |\n+--------------------+\n+--------------------+\nEmpty set\n\nnebula> UPSERT VERTEX ON player \"player666\" \\\n SET age = 30 \\\n WHEN name == \"Joe\" \\\n YIELD name AS Name, age AS Age;\n+----------+----------+\n| Name | Age |\n+----------+----------+\n| __NULL__ | 30 |\n+----------+----------+\n\nnebula> UPSERT VERTEX ON player \"player666\" \\\n SET age = 31 \\\n WHEN name == \"Joe\" \\\n YIELD name AS Name, age AS Age;\n+----------+-----+\n| Name | Age |\n+----------+-----+\n| __NULL__ | 30 |\n+----------+-----+\n\nnebula> UPSERT VERTEX ON player \"player667\" \\\n SET age = 31 \\\n YIELD name AS Name, age AS Age;\n+----------+-----+\n| Name | Age |\n+----------+-----+\n| __NULL__ | 31 |\n+----------+-----+\n\nnebula> UPSERT VERTEX ON player \"player668\" \\\n SET name = \"Amber\", age = age + 1 \\\n YIELD name AS Name, age AS Age;\n+---------+----------+\n| Name | Age |\n+---------+----------+\n| \"Amber\" | __NULL__ |\n+---------+----------+\n
In the last query of the preceding examples, since age
has no default value, when the vertex is created, age
is NULL
, and age = age + 1
does not take effect. But if age
has a default value, age = age + 1
will take effect. For example:
nebula> CREATE TAG IF NOT EXISTS player_with_default(name string, age int DEFAULT 20);\nExecution succeeded\n\nnebula> UPSERT VERTEX ON player_with_default \"player101\" \\\n SET age = age + 1 \\\n YIELD name AS Name, age AS Age;\n\n+----------+-----+\n| Name | Age |\n+----------+-----+\n| __NULL__ | 21 |\n+----------+-----+\n
"},{"location":"3.ngql-guide/12.vertex-statements/3.upsert-vertex/#update_a_vertex_if_it_exists","title":"Update a vertex if it exists","text":"If the vertex exists and the WHEN
conditions are met, the vertex is updated.
nebula> FETCH PROP ON player \"player101\" YIELD properties(vertex);\n+--------------------------------+\n| properties(VERTEX) |\n+--------------------------------+\n| {age: 36, name: \"Tony Parker\"} |\n+--------------------------------+\n\nnebula> UPSERT VERTEX ON player \"player101\" \\\n SET age = age + 2 \\\n WHEN name == \"Tony Parker\" \\\n YIELD name AS Name, age AS Age;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 38 |\n+---------------+-----+\n
If the vertex exists and the WHEN
conditions are not met, the update does not take effect.
nebula> FETCH PROP ON player \"player101\" YIELD properties(vertex);\n+--------------------------------+\n| properties(VERTEX) |\n+--------------------------------+\n| {age: 38, name: \"Tony Parker\"} |\n+--------------------------------+\n\nnebula> UPSERT VERTEX ON player \"player101\" \\\n SET age = age + 2 \\\n WHEN name == \"Someone else\" \\\n YIELD name AS Name, age AS Age;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 38 |\n+---------------+-----+\n
"},{"location":"3.ngql-guide/12.vertex-statements/4.delete-vertex/","title":"DELETE VERTEX","text":"By default, the DELETE VERTEX
statement deletes vertices but the incoming and outgoing edges of the vertices.
Compatibility
The DELETE VERTEX
statement deletes one vertex or multiple vertices at a time. You can use DELETE VERTEX
together with pipes. For more information about pipe, see Pipe operator.
Note
DELETE VERTEX
deletes vertices directly.DELETE TAG
deletes a tag with the given name on a specified vertex.DELETE VERTEX <vid> [, <vid> ...] [WITH EDGE];\n
This query deletes the vertex whose ID is \"team1\".
# Delete the vertex whose VID is `team1` but the related incoming and outgoing edges are not deleted.\nnebula> DELETE VERTEX \"team1\";\n\n# Delete the vertex whose VID is `team1` and the related incoming and outgoing edges.\nnebula> DELETE VERTEX \"team1\" WITH EDGE;\n
This query shows that you can use DELETE VERTEX
together with pipe to delete vertices.
nebula> GO FROM \"player100\" OVER serve WHERE properties(edge).start_year == \"2021\" YIELD dst(edge) AS id | DELETE VERTEX $-.id;\n
"},{"location":"3.ngql-guide/12.vertex-statements/4.delete-vertex/#process_of_deleting_vertices","title":"Process of deleting vertices","text":"Once NebulaGraph deletes the vertices, all edges (incoming and outgoing edges) of the target vertex will become dangling edges. When NebulaGraph deletes the vertices WITH EDGE
, NebulaGraph traverses the incoming and outgoing edges related to the vertices and deletes them all. Then NebulaGraph deletes the vertices.
Caution
--storage_client_timeout_ms
in nebula-graphd.conf
to extend the timeout period.The INSERT EDGE
statement inserts an edge or multiple edges into a graph space from a source vertex (given by src_vid) to a destination vertex (given by dst_vid) with a specific rank in NebulaGraph.
When inserting an edge that already exists, INSERT EDGE
overrides the edge.
INSERT EDGE [IF NOT EXISTS] <edge_type> ( <prop_name_list> ) VALUES \n<src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> )\n[, <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ), ...];\n\n<prop_name_list> ::=\n [ <prop_name> [, <prop_name> ] ...]\n\n<prop_value_list> ::=\n [ <prop_value> [, <prop_value> ] ...]\n
IF NOT EXISTS
detects if the edge that you want to insert exists. If it does not exist, a new one will be inserted.
Note
IF NOT EXISTS
only detects whether exist and does not detect whether the property values overlap. IF NOT EXISTS
will read to check whether the data exists, which will have a significant impact on performance.<edge_type>
denotes the edge type, which must be created before INSERT EDGE
. Only one edge type can be specified in this statement.<prop_name_list>
is the property name list in the given <edge_type>
.src_vid
is the VID of the source vertex. It specifies the start of an edge.dst_vid
is the VID of the destination vertex. It specifies the end of an edge.rank
is optional. It specifies the edge rank of the same edge type. The data type is int
. If not specified, the default value is 0
. You can insert many edges with the same edge type, source vertex, and destination vertex by using different rank values.
OpenCypher compatibility
OpenCypher has no such concept as rank.
<prop_value_list>
must provide the value list according to <prop_name_list>
. If the property values do not match the data type in the edge type, an error is returned. When the NOT NULL
constraint is set for a given property, an error is returned if no property is given. When the default value for a property is NULL
, you can omit to specify the property value. For details, see CREATE EDGE.# The following example creates edge type e1 with no property and inserts an edge from vertex \"10\" to vertex \"11\" with no property.\nnebula> CREATE EDGE IF NOT EXISTS e1(); \nnebula> INSERT EDGE e1 () VALUES \"10\"->\"11\":(); \n\n# The following example inserts an edge from vertex \"10\" to vertex \"11\" with no property. The edge rank is 1.\nnebula> INSERT EDGE e1 () VALUES \"10\"->\"11\"@1:(); \n
nebula> CREATE EDGE IF NOT EXISTS e2 (name string, age int); \nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 1);\n\n# The following example creates edge type e2 with two properties.\nnebula> INSERT EDGE e2 (name, age) VALUES \\\n \"12\"->\"13\":(\"n1\", 1), \"13\"->\"14\":(\"n2\", 2); \n\n# In the following example, the insertion fails because \"a13\" is not int.\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", \"a13\");\n
An edge can be inserted/written with property values multiple times. Only the last written values can be read.
The following examples insert edge e2 with the new values for multiple times.\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 12);\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 13);\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 14);\nnebula> FETCH PROP ON e2 \"11\"->\"13\" YIELD edge AS e;\n+-------------------------------------------+\n| e |\n+-------------------------------------------+\n| [:e2 \"11\"->\"13\" @0 {age: 14, name: \"n1\"}] |\n+-------------------------------------------+\n
If you insert an edge that already exists with IF NOT EXISTS
, there will be no modification.
# The following example inserts edge e2 from vertex \"14\" to vertex \"15\".\nnebula> INSERT EDGE e2 (name, age) VALUES \"14\"->\"15\"@1:(\"n1\", 12);\n# The following example alters the edge with IF NOT EXISTS. But there will be no alteration because edge e2 already exists.\nnebula> INSERT EDGE IF NOT EXISTS e2 (name, age) VALUES \"14\"->\"15\"@1:(\"n2\", 13);\nnebula> FETCH PROP ON e2 \"14\"->\"15\"@1 YIELD edge AS e;\n+-------------------------------------------+\n| e |\n+-------------------------------------------+\n| [:e2 \"14\"->\"15\" @1 {age: 12, name: \"n1\"}] |\n+-------------------------------------------+\n
Note
<edgetype>._src
or <edgetype>._dst
(which is not recommended).edge conflict
error, so please try again later.The UPDATE EDGE
statement updates properties on an edge.
In NebulaGraph, UPDATE EDGE
supports compare-and-swap (CAS).
UPDATE EDGE ON <edge_type>\n<src_vid> -> <dst_vid> [@<rank>]\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <output>]\n
Parameter Required Description Example ON <edge_type>
Yes Specifies the edge type. The properties to be updated must be on this edge type. ON serve
<src_vid>
Yes Specifies the source vertex ID of the edge. \"player100\"
<dst_vid>
Yes Specifies the destination vertex ID of the edge. \"team204\"
<rank>
No Specifies the rank of the edge. The data type is int
. 10
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET start_year = start_year +1
WHEN <condition>
No Specifies the filter conditions. If <condition>
evaluates to false
, the SET
clause does not take effect. WHEN end_year < 2010
YIELD <output>
No Specifies the output format of the statement. YIELD start_year AS Start_Year
"},{"location":"3.ngql-guide/13.edge-statements/2.update-edge/#example","title":"Example","text":"The following example checks the properties of the edge with the GO statement.
nebula> GO FROM \"player100\" \\\n OVER serve \\\n YIELD properties(edge).start_year, properties(edge).end_year;\n+-----------------------------+---------------------------+\n| properties(EDGE).start_year | properties(EDGE).end_year |\n+-----------------------------+---------------------------+\n| 1997 | 2016 |\n+-----------------------------+---------------------------+\n
The following example updates the start_year
property and returns the end_year
and the new start_year
.
nebula> UPDATE EDGE on serve \"player100\" -> \"team204\"@0 \\\n SET start_year = start_year + 1 \\\n WHEN end_year > 2010 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 1998 | 2016 |\n+------------+----------+\n
"},{"location":"3.ngql-guide/13.edge-statements/3.upsert-edge/","title":"UPSERT EDGE","text":"The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT EDGE
to update the properties of an edge if it exists or insert a new edge if it does not exist.
The performance of UPSERT
is much lower than that of INSERT
because UPSERT
is a read-modify-write serialization operation at the partition level.
Danger
Do not use UPSERT
for scenarios with highly concurrent writes. You can use UPDATE
or INSERT
instead.
UPSERT EDGE ON <edge_type>\n<src_vid> -> <dst_vid> [@rank]\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <properties>]\n
Parameter Required Description Example ON <edge_type>
Yes Specifies the edge type. The properties to be updated must be on this edge type. ON serve
<src_vid>
Yes Specifies the source vertex ID of the edge. \"player100\"
<dst_vid>
Yes Specifies the destination vertex ID of the edge. \"team204\"
<rank>
No Specifies the rank of the edge. 10
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET start_year = start_year +1
WHEN <condition>
No Specifies the filter conditions. WHEN end_year < 2010
YIELD <output>
No Specifies the output format of the statement. YIELD start_year AS Start_Year
"},{"location":"3.ngql-guide/13.edge-statements/3.upsert-edge/#insert_an_edge_if_it_does_not_exist","title":"Insert an edge if it does not exist","text":"If an edge does not exist, it is created no matter the conditions in the WHEN
clause are met or not, and the SET
clause takes effect. The property values of the new edge depend on:
SET
clause is defined.For example, if:
start_year
and end_year
based on the edge type serve
.SET
clause specifies that end_year = 2021
.Then the property values in different cases are listed as follows:
AreWHEN
conditions met If properties have default values Value of start_year
Value of end_year
Yes Yes The default value 2021
Yes No NULL
2021
No Yes The default value 2021
No No NULL
2021
Here are some examples:
// This example checks if the following three vertices have any outgoing serve edge. The result \"Empty set\" indicates that such an edge does not exist.\nnebula> GO FROM \"player666\", \"player667\", \"player668\" \\\n OVER serve \\\n YIELD properties(edge).start_year, properties(edge).end_year;\n+-----------------------------+---------------------------+\n| properties(EDGE).start_year | properties(EDGE).end_year |\n+-----------------------------+---------------------------+\n+-----------------------------+---------------------------+\nEmpty set\n\nnebula> UPSERT EDGE on serve \\\n \"player666\" -> \"team200\"@0 \\\n SET end_year = 2021 \\\n WHEN end_year == 2010 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2021 |\n+------------+----------+\n\nnebula> UPSERT EDGE on serve \\\n \"player666\" -> \"team200\"@0 \\\n SET end_year = 2022 \\\n WHEN end_year == 2010 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2021 |\n+------------+----------+\n\nnebula> UPSERT EDGE on serve \\\n \"player667\" -> \"team200\"@0 \\\n SET end_year = 2022 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2022 |\n+------------+----------+\n\nnebula> UPSERT EDGE on serve \\\n \"player668\" -> \"team200\"@0 \\\n SET start_year = 2000, end_year = end_year + 1 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 2000 | __NULL__ |\n+------------+----------+\n
In the last query of the preceding example, since end_year
has no default value, when the edge is created, end_year
is NULL
, and end_year = end_year + 1
does not take effect. But if end_year
has a default value, end_year = end_year + 1
will take effect. For example:
nebula> CREATE EDGE IF NOT EXISTS serve_with_default(start_year int, end_year int DEFAULT 2010);\nExecution succeeded\n\nnebula> UPSERT EDGE on serve_with_default \\\n \"player668\" -> \"team200\" \\\n SET end_year = end_year + 1 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2011 |\n+------------+----------+\n
"},{"location":"3.ngql-guide/13.edge-statements/3.upsert-edge/#update_an_edge_if_it_exists","title":"Update an edge if it exists","text":"If the edge exists and the WHEN
conditions are met, the edge is updated.
nebula> MATCH (v:player{name:\"Ben Simmons\"})-[e:serve]-(v2) \\\n RETURN e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player149\"->\"team219\" @0 {end_year: 2019, start_year: 2016}] |\n+-----------------------------------------------------------------------+\n\nnebula> UPSERT EDGE on serve \\\n \"player149\" -> \"team219\" \\\n SET end_year = end_year + 1 \\\n WHEN start_year == 2016 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 2016 | 2020 |\n+------------+----------+\n
If the edge exists and the WHEN
conditions are not met, the update does not take effect.
nebula> MATCH (v:player{name:\"Ben Simmons\"})-[e:serve]-(v2) \\\n RETURN e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player149\"->\"team219\" @0 {end_year: 2020, start_year: 2016}] |\n+-----------------------------------------------------------------------+\n\n\nnebula> UPSERT EDGE on serve \\\n \"player149\" -> \"team219\" \\\n SET end_year = end_year + 1 \\\n WHEN start_year != 2016 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 2016 | 2020 |\n+------------+----------+\n
"},{"location":"3.ngql-guide/13.edge-statements/4.delete-edge/","title":"DELETE EDGE","text":"The DELETE EDGE
statement deletes one edge or multiple edges at a time. You can use DELETE EDGE
together with pipe operators. For more information, see PIPE OPERATORS.
To delete all the outgoing edges for a vertex, please delete the vertex. For more information, see DELETE VERTEX.
"},{"location":"3.ngql-guide/13.edge-statements/4.delete-edge/#syntax","title":"Syntax","text":"DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid>[@<rank>] ...]\n
Caution
If no rank is specified, NebulaGraph only deletes the edge with rank 0. Delete edges with all ranks, as shown in the following example.
"},{"location":"3.ngql-guide/13.edge-statements/4.delete-edge/#examples","title":"Examples","text":"nebula> DELETE EDGE serve \"player100\" -> \"team204\"@0;\n
The following example shows that you can use DELETE EDGE
together with pipe operators to delete edges that meet the conditions.
nebula> GO FROM \"player100\" OVER follow \\\n WHERE dst(edge) == \"player101\" \\\n YIELD src(edge) AS src, dst(edge) AS dst, rank(edge) AS rank \\\n | DELETE EDGE follow $-.src->$-.dst @ $-.rank;\n
"},{"location":"3.ngql-guide/14.native-index-statements/","title":"Index overview","text":"Indexes are built to fast process graph queries. Nebula\u00a0Graph supports two kinds of indexes: native indexes and full-text indexes. This topic introduces the index types and helps choose the right index.
"},{"location":"3.ngql-guide/14.native-index-statements/#usage_instructions","title":"Usage Instructions","text":"LOOKUP
statement. If there is no index, an error will be reported when executing the LOOKUP
statement.ID numbers
is 1
), can significantly improve query performance. For indexes with low selectivity (such as country
), query performance might not experience a substantial improvement.Native indexes allow querying data based on a given property. Features are as follows.
REBUILD INDEX
statement to update native indexes.Full-text indexes are used to do prefix, wildcard, regexp, and fuzzy search on a string property. Features are as follows.
AND
, OR
, and NOT
.Note
To do complete string matches, use native indexes.
"},{"location":"3.ngql-guide/14.native-index-statements/#null_values","title":"Null values","text":"Indexes do not support indexing null values.
"},{"location":"3.ngql-guide/14.native-index-statements/#range_queries","title":"Range queries","text":"In addition to querying single results from native indexes, you can also do range queries. Not all the native indexes support range queries. You can only do range searches for numeric, date, and time type properties.
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/","title":"CREATE INDEX","text":""},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#prerequisites","title":"Prerequisites","text":"Before you create an index, make sure that the relative tag or edge type is created. For how to create tags or edge types, see CREATE TAG and CREATE EDGE.
For how to create full-text indexes, see Deploy full-text index.
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#must-read_for_using_indexes","title":"Must-read for using indexes","text":"The concept and using restrictions of indexes are comparatively complex. Before you use indexes, you must read the following sections carefully.
You can use CREATE INDEX
to add native indexes for the existing tags, edge types, or properties. They are usually called as tag indexes, edge type indexes, and property indexes.
LOOKUP
to retrieve all the vertices with the tag player
.age
property to retrieve the VID of all vertices that meet age == 19
.If a property index i_TA
is created for the property A
of the tag T
and i_T
for the tag T
, the indexes can be replaced as follows (the same for edge type indexes):
i_TA
to replace i_T
.In the MATCH
and LOOKUP
statements, i_T
may replace i_TA
for querying properties.
Legacy version compatibility
In previous releases, the tag or edge type index in the LOOKUP
statement cannot replace the property index for property queries.
Although the same results can be obtained by using alternative indexes for queries, the query performance varies according to the selected index.
Caution
Indexes can dramatically reduce the write performance. The performance can be greatly reduced. DO NOT use indexes in production environments unless you are fully aware of their influences on your service.
Long indexes decrease the scan performance of the Storage Service and use more memory. We suggest that you set the indexing length the same as that of the longest string to be indexed. For variable-length string-type properties, the longest index length is 256 bytes; for fixed-length string-type properties, the longest index length is the length of the index itself.
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#steps","title":"Steps","text":"If you must use indexes, we suggest that you:
Import the data into NebulaGraph.
Create indexes.
Rebuild indexes.
After the index is created and the data is imported, you can use LOOKUP or MATCH to retrieve the data. You do not need to specify which indexes to use in a query, NebulaGraph figures that out by itself.
Note
If you create an index before importing the data, the importing speed will be extremely slow due to the reduction in the write performance.
Keep --disable_auto_compaction = false
during daily incremental writing.
The newly created index will not take effect immediately. Trying to use a newly created index (such as LOOKUP
orREBUILD INDEX
) may fail and return can't find xxx in the space
because the creation is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs
in the configuration files for all services.
Danger
After creating a new index, or dropping the old index and creating a new one with the same name again, you must REBUILD INDEX
. Otherwise, these data cannot be returned in the MATCH
and LOOKUP
statements.
CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name> ON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT '<comment>'];\n
Parameter Description TAG | EDGE
Specifies the index type that you want to create. IF NOT EXISTS
Detects if the index that you want to create exists. If it does not exist, a new one will be created. <index_name>
1. The name of the index. It must be unique in a graph space. A recommended way of naming is i_tagName_propName
. 2. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number.3. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words.Note:1. If you name an index in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in an index name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <tag_name> | <edge_name>
Specifies the name of the tag or edge associated with the index. <prop_name_list>
To index a variable-length string property, you must use prop_name(length)
to specify the index length, and the maximum index length is 256. To index a tag or an edge type, ignore the prop_name_list
. COMMENT
The remarks of the index. The maximum length is 256 bytes. By default, there will be no comments on an index."},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#create_tagedge_type_indexes","title":"Create tag/edge type indexes","text":"nebula> CREATE TAG INDEX IF NOT EXISTS player_index on player();\n
nebula> CREATE EDGE INDEX IF NOT EXISTS follow_index on follow();\n
After indexing a tag or an edge type, you can use the LOOKUP
statement to retrieve the VID of all vertices with the tag
, or the source vertex ID, destination vertex ID, and ranks
of all edges with the edge type
. For more information, see LOOKUP.
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_0 on player(name(10));\n
The preceding example creates an index for the name
property on all vertices carrying the player
tag. This example creates an index using the first 10 characters of the name
property.
# To index a variable-length string property, you need to specify the index length.\nnebula> CREATE TAG IF NOT EXISTS var_string(p1 string);\nnebula> CREATE TAG INDEX IF NOT EXISTS var ON var_string(p1(10));\n\n# To index a fixed-length string property, you do not need to specify the index length.\nnebula> CREATE TAG IF NOT EXISTS fix_string(p1 FIXED_STRING(10));\nnebula> CREATE TAG INDEX IF NOT EXISTS fix ON fix_string(p1);\n
nebula> CREATE EDGE INDEX IF NOT EXISTS follow_index_0 on follow(degree);\n
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#create_composite_property_indexes","title":"Create composite property indexes","text":"An index on multiple properties on a tag (or an edge type) is called a composite property index.
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_1 on player(name(10), age);\n
Caution
Creating composite property indexes across multiple tags or edge types is not supported.
Note
NebulaGraph follows the left matching principle to select indexes.
"},{"location":"3.ngql-guide/14.native-index-statements/2.1.show-create-index/","title":"SHOW CREATE INDEX","text":"SHOW CREATE INDEX
shows the statement used when creating a tag or an edge type. It contains detailed information about the index, such as its associated properties.
SHOW CREATE {TAG | EDGE} INDEX <index_name>;\n
"},{"location":"3.ngql-guide/14.native-index-statements/2.1.show-create-index/#examples","title":"Examples","text":"You can run SHOW TAG INDEXES
to list all tag indexes, and then use SHOW CREATE TAG INDEX
to show the information about the creation of the specified index.
nebula> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\n\nnebula> SHOW CREATE TAG INDEX player_index_1;\n+------------------+--------------------------------------------------+\n| Tag Index Name | Create Tag Index |\n+------------------+--------------------------------------------------+\n| \"player_index_1\" | \"CREATE TAG INDEX `player_index_1` ON `player` ( |\n| | `name`(20) |\n| | )\" |\n+------------------+--------------------------------------------------+\n
Edge indexes can be queried through a similar approach.
nebula> SHOW EDGE INDEXES;\n+----------------+----------+---------+\n| Index Name | By Edge | Columns |\n+----------------+----------+---------+\n| \"follow_index\" | \"follow\" | [] |\n+----------------+----------+---------+\n\nnebula> SHOW CREATE EDGE INDEX follow_index;\n+-----------------+-------------------------------------------------+\n| Edge Index Name | Create Edge Index |\n+-----------------+-------------------------------------------------+\n| \"follow_index\" | \"CREATE EDGE INDEX `follow_index` ON `follow` ( |\n| | )\" |\n+-----------------+-------------------------------------------------+\n
"},{"location":"3.ngql-guide/14.native-index-statements/2.show-native-indexes/","title":"SHOW INDEXES","text":"SHOW INDEXES
shows the names of the defined tag or edge type indexes in the current graph space.
SHOW {TAG | EDGE} INDEXES\n
"},{"location":"3.ngql-guide/14.native-index-statements/2.show-native-indexes/#examples","title":"Examples","text":"nebula> SHOW TAG INDEXES;\n+------------------+--------------+-----------------+\n| Index Name | By Tag | Columns |\n+------------------+--------------+-----------------+\n| \"fix\" | \"fix_string\" | [\"p1\"] |\n| \"player_index_0\" | \"player\" | [\"name\"] |\n| \"player_index_1\" | \"player\" | [\"name\", \"age\"] |\n| \"var\" | \"var_string\" | [\"p1\"] |\n+------------------+--------------+-----------------+\n\nnebula> SHOW EDGE INDEXES;\n+----------------+----------+---------+\n| Index Name | By Edge | Columns |\n| \"follow_index\" | \"follow\" | [] |\n+----------------+----------+---------+\n
Legacy version compatibility
In NebulaGraph 2.x, the SHOW TAG/EDGE INDEXES
statement only returns Names
.
DESCRIBE INDEX
retrieves information about the index with a given name, including the property name (Field) and the property type (Type) of the index.
DESCRIBE {TAG | EDGE} INDEX <index_name>;\n
"},{"location":"3.ngql-guide/14.native-index-statements/3.describe-native-index/#examples","title":"Examples","text":"nebula> DESCRIBE TAG INDEX player_index_0;\n+--------+--------------------+\n| Field | Type |\n+--------+--------------------+\n| \"name\" | \"fixed_string(30)\" |\n+--------+--------------------+\n\nnebula> DESCRIBE TAG INDEX player_index_1;\n+--------+--------------------+\n| Field | Type |\n+--------+--------------------+\n| \"name\" | \"fixed_string(10)\" |\n| \"age\" | \"int64\" |\n+--------+--------------------+\n
"},{"location":"3.ngql-guide/14.native-index-statements/4.rebuild-native-index/","title":"REBUILD INDEX","text":"Danger
If data is updated or inserted before the creation of the index, you must rebuild the index manually to make sure that it contains the previously added data. Otherwise, you cannot use LOOKUP
and MATCH
to query the data based on the index. If the index is created before any data insertion, there is no need to rebuild the index.
to rebuild the created tag or edge type index. For details on how to create an index, see CREATE INDEX.
Caution
The speed of rebuilding indexes can be optimized by modifying the rebuild_index_part_rate_limit
and snapshot_batch_size
parameters in the configuration file. Note that greater parameter values may result in higher memory and network usage. See Storage Service configurations for details.
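A sketch of the corresponding lines in nebula-storaged.conf; the flag names come from the text above, while the values here are illustrative only, not recommendations:
--rebuild_index_part_rate_limit=4194304\n--snapshot_batch_size=1048576\n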
REBUILD {TAG | EDGE} INDEX [<index_name_list>];\n\n<index_name_list>::=\n [index_name [, index_name] ...]\n
Multiple indexes are permitted in a single REBUILD
statement, separated by commas. When the index name is not specified, all tag or edge indexes are rebuilt. After the rebuilding is complete, you can use the SHOW {TAG | EDGE} INDEX STATUS
command to check if the index is successfully rebuilt. For details on index status, see SHOW INDEX STATUS.nebula> CREATE TAG IF NOT EXISTS person(name string, age int, gender string, email string);\nnebula> CREATE TAG INDEX IF NOT EXISTS single_person_index ON person(name(10));\n\n# The following example rebuilds an index and returns the job ID.\nnebula> REBUILD TAG INDEX single_person_index;\n+------------+\n| New Job Id |\n+------------+\n| 31 |\n+------------+\n\n# The following example checks the index status.\nnebula> SHOW TAG INDEX STATUS;\n+-----------------------+--------------+\n| Name | Index Status |\n+-----------------------+--------------+\n| \"single_person_index\" | \"FINISHED\" |\n+-----------------------+--------------+\n\n# You can also use \"SHOW JOB <job_id>\" to check if the rebuilding process is complete.\nnebula> SHOW JOB 31;\n+----------------+---------------------+------------+-------------------------+-------------------------+-------------+\n| Job Id(TaskId) | Command(Dest) | Status | Start Time | Stop Time | Error Code |\n+----------------+---------------------+------------+-------------------------+-------------------------+-------------+\n| 31 | \"REBUILD_TAG_INDEX\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:24.000 | \"SUCCEEDED\" |\n| 0 | \"storaged1\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:28.000 | \"SUCCEEDED\" |\n| 1 | \"storaged2\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:28.000 | \"SUCCEEDED\" |\n| 2 | \"storaged0\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:28.000 | \"SUCCEEDED\" |\n| \"Total:3\" | \"Succeeded:3\" | \"Failed:0\" | \"In Progress:0\" | \"\" | \"\" |\n+----------------+---------------------+------------+-------------------------+-------------------------+-------------+\n
NebulaGraph creates a job to rebuild the index. The job ID is displayed in the preceding return message. To check if the rebuilding process is complete, use the SHOW JOB <job_id>
statement. For more information, see SHOW JOB.
SHOW INDEX STATUS
returns the name of the created tag or edge type index and its job status.
The status of rebuilding indexes includes:
QUEUE
: The job is in a queue.RUNNING
: The job is running.FINISHED
: The job is finished.FAILED
: The job has failed.STOPPED
: The job has stopped.INVALID
: The job is invalid.Note
For details on how to create an index, see CREATE INDEX.
"},{"location":"3.ngql-guide/14.native-index-statements/5.show-native-index-status/#syntax","title":"Syntax","text":"SHOW {TAG | EDGE} INDEX STATUS;\n
"},{"location":"3.ngql-guide/14.native-index-statements/5.show-native-index-status/#example","title":"Example","text":"nebula> SHOW TAG INDEX STATUS;\n+----------------------+--------------+\n| Name | Index Status |\n+----------------------+--------------+\n| \"player_index_0\" | \"FINISHED\" |\n| \"player_index_1\" | \"FINISHED\" |\n+----------------------+--------------+\n
"},{"location":"3.ngql-guide/14.native-index-statements/6.drop-native-index/","title":"DROP INDEX","text":"DROP INDEX
removes an existing index from the current graph space.
Running the DROP INDEX
statement requires the DROP TAG INDEX
or DROP EDGE INDEX
privilege in the given graph space. Otherwise, NebulaGraph throws an error.
DROP {TAG | EDGE} INDEX [IF EXISTS] <index_name>;\n
IF EXISTS
: Detects whether the index that you want to drop exists. If it exists, it will be dropped.
nebula> DROP TAG INDEX player_index_0;\n
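To avoid an error when the index may not exist, the IF EXISTS keywords from the syntax above can be added, as in this sketch:
nebula> DROP TAG INDEX IF EXISTS player_index_0;\nnebula> DROP EDGE INDEX IF EXISTS follow_index;\n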
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/","title":"Full-text indexes","text":"Full-text indexes are used to do prefix, wildcard, regexp, and fuzzy search on a string property.
You can use the WHERE
clause to specify the search strings in LOOKUP
statements.
Before using the full-text index, make sure that you have deployed an Elasticsearch cluster and a Listener cluster. For more information, see Deploy Elasticsearch and Deploy Listener.
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#precaution","title":"Precaution","text":"Before using the full-text index, make sure that you know the restrictions.
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#full_text_queries","title":"Full Text Queries","text":"Full-text queries enable you to search for parsed text fields, using a parser with strict syntax to return content based on the query string provided. For details, see Query string query.
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#syntax","title":"Syntax","text":""},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#create_full-text_indexes","title":"Create full-text indexes","text":"CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} (<prop_name> [,<prop_name>]...) [ANALYZER=\"<analyzer_name>\"];\n
<analyzer_name>
is the name of the analyzer. The default value is standard
. To use other analyzers (e.g. IK Analysis), you need to make sure that the corresponding analyzer is installed in Elasticsearch in advance.SHOW FULLTEXT INDEXES;\n
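For the ANALYZER option above, a hedged sketch of creating a full-text index with a non-default analyzer, assuming the IK analyzer plugin is already installed in Elasticsearch; the index name ft_ik_name and the analyzer name ik_max_word are illustrative:
nebula> CREATE FULLTEXT TAG INDEX ft_ik_name ON player(name) ANALYZER=\"ik_max_word\";\n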
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#rebuild_full-text_indexes","title":"Rebuild full-text indexes","text":"REBUILD FULLTEXT INDEX;\n
Caution
When there is a large amount of data, rebuilding full-text indexes is slow. You can modify snapshot_send_files=false
in the configuration file of the Storage service (nebula-storaged.conf
).
DROP FULLTEXT INDEX <index_name>;\n
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#use_query_options","title":"Use query options","text":"LOOKUP ON {<tag> | <edge_type>} WHERE ES_QUERY(<index_name>, \"<text>\") YIELD <return_list> [| LIMIT [<offset>,] <number_rows>];\n\n<return_list>\n <prop_name> [AS <prop_alias>] [, <prop_name> [AS <prop_alias>] ...] [, id(vertex) [AS <prop_alias>]] [, score() AS <score_alias>]\n
index_name
: The name of the full-text index.text
: Search conditions. The WHERE clause can only be followed by ES_QUERY, and all filter conditions must be written in the text. For supported syntax, see Query string syntax.score()
: The score calculated by doing N degree expansion for the eligible vertices. The default value is 1.0
. The higher the score, the higher the degree of match. The return value is sorted by default from highest to lowest score. For details, see Search and Scoring in Lucene.// This example creates the graph space.\nnebula> CREATE SPACE IF NOT EXISTS basketballplayer (partition_num=3,replica_factor=1, vid_type=fixed_string(30));\n\n// This example signs in the text service.\nnebula> SIGN IN TEXT SERVICE (192.168.8.100:9200, HTTP);\n\n// This example checks the text service status.\nnebula> SHOW TEXT SEARCH CLIENTS;\n+-----------------+-----------------+------+\n| Type | Host | Port |\n+-----------------+-----------------+------+\n| \"ELASTICSEARCH\" | \"192.168.8.100\" | 9200 |\n+-----------------+-----------------+------+\n\n// This example switches the graph space.\nnebula> USE basketballplayer;\n\n// This example adds the listener to the NebulaGraph cluster.\nnebula> ADD LISTENER ELASTICSEARCH 192.168.8.100:9789;\n\n// This example checks the listener status. When the status is `Online`, the listener is ready.\nnebula> SHOW LISTENER;\n+--------+-----------------+------------------------+-------------+\n| PartId | Type | Host | Host Status |\n+--------+-----------------+------------------------+-------------+\n| 1 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n| 2 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n| 3 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n+--------+-----------------+------------------------+-------------+\n\n// This example creates the tag.\nnebula> CREATE TAG IF NOT EXISTS player(name string, city string);\n\n// This example creates a single-attribute full-text index.\nnebula> CREATE FULLTEXT TAG INDEX fulltext_index_1 ON player(name) ANALYZER=\"standard\";\n\n// This example creates a multi-attribute full-text indexe.\nnebula> CREATE FULLTEXT TAG INDEX fulltext_index_2 ON player(name,city) ANALYZER=\"standard\";\n\n// This example rebuilds the full-text index.\nnebula> REBUILD FULLTEXT INDEX;\n\n// This example shows the full-text index.\nnebula> SHOW FULLTEXT INDEXES;\n+--------------------+-------------+-------------+--------------+------------+\n| Name | Schema Type | Schema Name | Fields | Analyzer |\n+--------------------+-------------+-------------+--------------+------------+\n| \"fulltext_index_1\" | \"Tag\" | \"player\" | \"name\" | \"standard\" |\n| \"fulltext_index_2\" | \"Tag\" | \"player\" | \"name, city\" | \"standard\" |\n+--------------------+-------------+-------------+--------------+------------+\n\n// This example inserts the test data.\nnebula> INSERT VERTEX player(name, city) VALUES \\\n \"Russell Westbrook\": (\"Russell Westbrook\", \"Los Angeles\"), \\\n \"Chris Paul\": (\"Chris Paul\", \"Houston\"),\\\n \"Boris Diaw\": (\"Boris Diaw\", \"Houston\"),\\\n \"David West\": (\"David West\", \"Philadelphia\"),\\\n \"Danny Green\": (\"Danny Green\", \"Philadelphia\"),\\\n \"Tim Duncan\": (\"Tim Duncan\", \"New York\"),\\\n \"James Harden\": (\"James Harden\", \"New York\"),\\\n \"Tony Parker\": (\"Tony Parker\", \"Chicago\"),\\\n \"Aron Baynes\": (\"Aron Baynes\", \"Chicago\"),\\\n \"Ben Simmons\": (\"Ben Simmons\", \"Phoenix\"),\\\n \"Blake Griffin\": (\"Blake Griffin\", \"Phoenix\");\n\n// These examples run test queries.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Chris\") YIELD id(vertex);\n+--------------+\n| id(VERTEX) |\n+--------------+\n| \"Chris Paul\" |\n+--------------+\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Harden\") YIELD 
properties(vertex);\n+----------------------------------------------------------------+\n| properties(VERTEX) |\n+----------------------------------------------------------------+\n| {_vid: \"James Harden\", city: \"New York\", name: \"James Harden\"} |\n+----------------------------------------------------------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Da*\") YIELD properties(vertex);\n+------------------------------------------------------------------+\n| properties(VERTEX) |\n+------------------------------------------------------------------+\n| {_vid: \"David West\", city: \"Philadelphia\", name: \"David West\"} |\n| {_vid: \"Danny Green\", city: \"Philadelphia\", name: \"Danny Green\"} |\n+------------------------------------------------------------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex);\n+---------------------+\n| id(VERTEX) |\n+---------------------+\n| \"Russell Westbrook\" |\n| \"Boris Diaw\" |\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Blake Griffin\" |\n+---------------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex) | LIMIT 2,3;\n+-----------------+\n| id(VERTEX) |\n+-----------------+\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Blake Griffin\" |\n+-----------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex) | YIELD count(*);\n+----------+\n| count(*) |\n+----------+\n| 5 |\n+----------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex), score() AS score;\n+---------------------+-------+\n| id(VERTEX) | score |\n+---------------------+-------+\n| \"Russell Westbrook\" | 1.0 |\n| \"Boris Diaw\" | 1.0 |\n| \"Aron Baynes\" | 1.0 |\n| \"Ben Simmons\" | 1.0 |\n| \"Blake Griffin\" | 1.0 |\n+---------------------+-------+\n\n// For documents containing a word `b`, its score will be multiplied by a weighting factor of 4, while for documents containing a word `c`, the default weighting factor of 1 is used.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*^4 OR *c*\") YIELD id(vertex), score() AS score;\n+---------------------+-------+\n| id(VERTEX) | score |\n+---------------------+-------+\n| \"Russell Westbrook\" | 4.0 |\n| \"Boris Diaw\" | 4.0 |\n| \"Aron Baynes\" | 4.0 |\n| \"Ben Simmons\" | 4.0 |\n| \"Blake Griffin\" | 4.0 |\n| \"Chris Paul\" | 1.0 |\n| \"Tim Duncan\" | 1.0 |\n+---------------------+-------+\n\n// When using a multi-attribute full-text index query, the conditions are matched within all properties of the index.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,\"*h*\") YIELD properties(vertex);\n+------------------------------------------------------------------+\n| properties(VERTEX) |\n+------------------------------------------------------------------+\n| {_vid: \"Chris Paul\", city: \"Houston\", name: \"Chris Paul\"} |\n| {_vid: \"Boris Diaw\", city: \"Houston\", name: \"Boris Diaw\"} |\n| {_vid: \"David West\", city: \"Philadelphia\", name: \"David West\"} |\n| {_vid: \"James Harden\", city: \"New York\", name: \"James Harden\"} |\n| {_vid: \"Tony Parker\", city: \"Chicago\", name: \"Tony Parker\"} |\n| {_vid: \"Aron Baynes\", city: \"Chicago\", name: \"Aron Baynes\"} |\n| {_vid: \"Ben Simmons\", city: \"Phoenix\", name: \"Ben Simmons\"} |\n| {_vid: \"Blake Griffin\", city: \"Phoenix\", name: \"Blake Griffin\"} |\n| {_vid: \"Danny Green\", city: \"Philadelphia\", name: \"Danny Green\"} 
|\n+------------------------------------------------------------------+\n\n// When using multi-attribute full-text index queries, you can specify different text for different properties for the query.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,\"name:*b* AND city:Houston\") YIELD properties(vertex);\n+-----------------------------------------------------------+\n| properties(VERTEX) |\n+-----------------------------------------------------------+\n| {_vid: \"Boris Diaw\", city: \"Houston\", name: \"Boris Diaw\"} |\n+-----------------------------------------------------------+\n\n// Delete single-attribute full-text index.\nnebula> DROP FULLTEXT INDEX fulltext_index_1;\n
"},{"location":"3.ngql-guide/17.query-tuning-statements/1.explain-and-profile/","title":"EXPLAIN and PROFILE","text":"EXPLAIN
helps output the execution plan of an nGQL statement without executing the statement.
PROFILE
executes the statement, then outputs the execution plan as well as the execution profile. You can optimize the queries for better performance according to the execution plan and profile.
The execution plan is determined by the execution planner in the NebulaGraph query engine.
The execution planner processes the parsed nGQL statements into actions
. An action
is the smallest unit that can be executed. A typical action
fetches all neighbors of a given vertex, gets the properties of an edge, and filters vertices or edges based on the given conditions. Each action
is assigned to an operator
that performs the action.
For example, a SHOW TAGS
statement is processed into two actions
and assigned to a Start operator
and a ShowTags operator
, while a more complex GO
statement may be processed into more than 10 actions
and assigned to 10 operators.
EXPLAIN
EXPLAIN [format= {\"row\" | \"dot\" | \"tck\"}] <your_nGQL_statement>;\n
PROFILE
PROFILE [format= {\"row\" | \"dot\" | \"tck\"}] <your_nGQL_statement>;\n
The output of an EXPLAIN
or a PROFILE
statement has three formats: the default row
format, the dot
format, and the tck
format. You can use the format
option to modify the output format. Omitting the format
option indicates using the default row
format.
row
format","text":"The row
format outputs the return message in a table as follows.
EXPLAIN
nebula> EXPLAIN format=\"row\" SHOW TAGS;\nExecution succeeded (time spent 327/892 us)\n\nExecution Plan\n\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n| id | name | dependencies | profiling data | operator info |\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n| 1 | ShowTags | 0 | | outputVar: [{\"colNames\":[],\"name\":\"__ShowTags_1\",\"type\":\"DATASET\"}] |\n| | | | | inputVar: |\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n| 0 | Start | | | outputVar: [{\"colNames\":[],\"name\":\"__Start_0\",\"type\":\"DATASET\"}] |\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n
PROFILE
nebula> PROFILE format=\"row\" SHOW TAGS;\n+--------+\n| Name |\n+--------+\n| player |\n+--------+\n| team |\n+--------+\nGot 2 rows (time spent 2038/2728 us)\n\nExecution Plan\n\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n| id | name | dependencies | profiling data | operator info |\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n| 1 | ShowTags | 0 | ver: 0, rows: 1, execTime: 42us, totalTime: 1177us | outputVar: [{\"colNames\":[],\"name\":\"__ShowTags_1\",\"type\":\"DATASET\"}] |\n| | | | | inputVar: |\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n| 0 | Start | | ver: 0, rows: 0, execTime: 1us, totalTime: 57us | outputVar: [{\"colNames\":[],\"name\":\"__Start_0\",\"type\":\"DATASET\"}] |\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n
The descriptions are as follows.
Parameter Descriptionid
The ID of the operator
. name
The name of the operator
. dependencies
The ID of the operator
that the current operator
depends on. profiling data
The content of the execution profile. ver
is the version of the operator
. rows
shows the number of rows to be output by the operator
. execTime
shows the execution time of the action
. totalTime
is the sum of the execution time, the system scheduling time, and the queueing time. operator info
The detailed information of the operator
."},{"location":"3.ngql-guide/17.query-tuning-statements/1.explain-and-profile/#the_dot_format","title":"The dot
format","text":"You can use the format=\"dot\"
option to output the return message in the dot
language, and then use Graphviz to generate a graph of the plan.
Note
Graphviz is open source graph visualization software. Graphviz provides an online tool for previewing DOT language files and exporting them to other formats such as SVG or JSON. For more information, see Graphviz Online.
nebula> EXPLAIN format=\"dot\" SHOW TAGS;\nExecution succeeded (time spent 161/665 us)\nExecution Plan\n--------------------------------------------------------------------------------------------------------------------------------------------- -------------\n plan\n--------------------------------------------------------------------------------------------------------------------------------------------- -------------\n digraph exec_plan {\n rankdir=LR;\n \"ShowTags_0\"[label=\"ShowTags_0|outputVar: \\[\\{\\\"colNames\\\":\\[\\],\\\"name\\\":\\\"__ShowTags_0\\\",\\\"type\\\":\\\"DATASET\\\"\\}\\]\\l|inputVar:\\l\", shape=Mrecord];\n \"Start_2\"->\"ShowTags_0\";\n \"Start_2\"[label=\"Start_2|outputVar: \\[\\{\\\"colNames\\\":\\[\\],\\\"name\\\":\\\"__Start_2\\\",\\\"type\\\":\\\"DATASET\\\"\\}\\]\\l|inputVar: \\l\", shape=Mrecord];\n }\n--------------------------------------------------------------------------------------------------------------------------------------------- -------------\n
The Graphviz graph transformed from the above DOT statement is as follows.
"},{"location":"3.ngql-guide/17.query-tuning-statements/1.explain-and-profile/#the_tck_format","title":"Thetck
format","text":"The tck format is similar to a table, but without borders and dividing lines between rows. You can use the results as test cases for unit testing. For information on tck format test cases, see TCK cases.
EXPLAIN
nebula> EXPLAIN format=\"tck\" FETCH PROP ON player \"player_1\",\"player_2\",\"player_3\" YIELD properties(vertex).name as name, properties(vertex).age as age;\nExecution succeeded (time spent 261\u00b5s/613.718\u00b5s)\nExecution Plan (optimize time 28 us)\n| id | name | dependencies | profiling data | operator info |\n| 2 | Project | 1 | | |\n| 1 | GetVertices | 0 | | |\n| 0 | Start | | | |\n\nWed, 22 Mar 2023 23:15:52 CST\n
PROFILE
nebula> PROFILE format=\"tck\" FETCH PROP ON player \"player_1\",\"player_2\",\"player_3\" YIELD properties(vertex).name as name, properties(vertex).age as age;\n| name | age |\n| \"Piter Park\" | 24 |\n| \"aaa\" | 24 |\n| \"ccc\" | 24 |\nGot 3 rows (time spent 1.474ms/2.19677ms)\nExecution Plan (optimize time 41 us)\n| id | name | dependencies | profiling data | operator info |\n| 2 | Project | 1 | {\"rows\":3,\"version\":0} | |\n| 1 | GetVertices | 0 | {\"resp[0]\":{\"exec\":\"232(us)\",\"host\":\"127.0.0.1:9779\",\"total\":\"758(us)\"},\"rows\":3,\"total_rpc\":\"875(us)\",\"version\":0} | |\n| 0 | Start | | {\"rows\":0,\"version\":0} | |\nWed, 22 Mar 2023 23:16:13 CST\n
The KILL SESSION
command is to terminate running sessions.
Note
root
user can terminate sessions.KILL SESSION
command, all Graph services synchronize the latest session information after 2* session_reclaim_interval_secs
seconds (120
seconds by default).You can run the KILL SESSION
command to terminate one or multiple sessions. The syntax is as follows:
To terminate one session
KILL {SESSION|SESSIONS} <SessionId>\n
{SESSION|SESSIONS}
: SESSION
or SESSIONS
, both are supported. <SessionId>
: Specifies the ID of one session. You can run the SHOW SESSIONS command to view the IDs of sessions.To terminate multiple sessions
SHOW SESSIONS \n| YIELD $-.SessionId AS sid [WHERE <filter_clause>]\n| KILL {SESSION|SESSIONS} $-.sid\n
Note
The KILL SESSION
command supports the pipeline operation, combining the SHOW SESSIONS
command with the KILL SESSION
command to terminate multiple sessions.
[WHERE <filter_clause>]
\uff1aWHERE
clause is used to filter sessions. <filter_expression>
specifies a session filtering expression, for example, WHERE $-.CreateTime < datetime(\"2022-12-14T18:00:00\")
. If the WHERE
clause is not specified, all sessions are terminated.WHERE
clause include: SessionId
, UserName
, SpaceName
, CreateTime
, UpdateTime
, GraphAddr
, Timezone
, and ClientIp
. You can run the SHOW SESSIONS command to view descriptions of these conditions.{SESSION|SESSIONS}
: SESSION
or SESSIONS
, both are supported.Caution
Please use filtering conditions with caution to avoid deleting sessions by mistake.
To terminate one session
nebula> KILL SESSION 1672887983842984 \n
To terminate multiple sessions
Terminate all sessions whose creation time is less than 2023-01-05T18:00:00
.
nebula> SHOW SESSIONS | YIELD $-.SessionId AS sid WHERE $-.CreateTime < datetime(\"2023-01-05T18:00:00\") | KILL SESSIONS $-.sid\n
Terminates the two sessions with the earliest creation times.
nebula> SHOW SESSIONS | YIELD $-.SessionId AS sid, $-.CreateTime as CreateTime | ORDER BY $-.CreateTime ASC | LIMIT 2 | KILL SESSIONS $-.sid\n
Terminates all sessions created by the username session_user1
.
nebula> SHOW SESSIONS | YIELD $-.SessionId as sid WHERE $-.UserName == \"session_user1\" | KILL SESSIONS $-.sid\n
Terminate all sessions.
nebula> SHOW SESSIONS | YIELD $-.SessionId as sid | KILL SESSION $-.sid\n\n// Or\nnebula> SHOW SESSIONS | KILL SESSIONS $-.SessionId\n
Caution
When you terminate all sessions, the current session is terminated. Please use it with caution.
KILL QUERY
can terminate the query being executed, and is often used to terminate slow queries.
Note
Users with the God role can kill any query. Other roles can only kill their own queries.
"},{"location":"3.ngql-guide/17.query-tuning-statements/6.kill-query/#syntax","title":"Syntax","text":"KILL QUERY (session=<session_id>, plan=<plan_id>);\n
session_id
: The ID of the session.plan_id
: The ID of the execution plan.The ID of the session and the ID of the execution plan can uniquely determine a query. Both can be obtained through the SHOW QUERIES statement.
"},{"location":"3.ngql-guide/17.query-tuning-statements/6.kill-query/#examples","title":"Examples","text":"This example executes KILL QUERY
in one session to terminate the query in another session.
nebula> KILL QUERY(SESSION=1625553545984255,PLAN=163);\n
The query will be terminated and the following information will be returned.
[ERROR (-1005)]: ExecutionPlanId[1001] does not exist in current Session.\n
"},{"location":"3.ngql-guide/3.data-types/1.numeric/","title":"Numeric types","text":"nGQL supports both integer and floating-point number.
"},{"location":"3.ngql-guide/3.data-types/1.numeric/#integer","title":"Integer","text":"Signed 64-bit integer (INT64), 32-bit integer (INT32), 16-bit integer (INT16), and 8-bit integer (INT8) are supported.
Type Declared keywords Range INT64INT64
orINT
-9,223,372,036,854,775,808 ~ 9,223,372,036,854,775,807 INT32 INT32
-2,147,483,648 ~ 2,147,483,647 INT16 INT16
-32,768 ~ 32,767 INT8 INT8
-128 ~ 127"},{"location":"3.ngql-guide/3.data-types/1.numeric/#floating-point_number","title":"Floating-point number","text":"Both single-precision floating-point format (FLOAT) and double-precision floating-point format (DOUBLE) are supported.
Type Declared keywords Range Precision FLOATFLOAT
3.4E +/- 38 6~7 bits DOUBLE DOUBLE
1.7E +/- 308 15~16 bits Scientific notation is also supported, such as 1e2
, 1.1e2
, .3e4
, 1.e4
, and -1234E-10
.
Note
The data type of DECIMAL in MySQL is not supported.
"},{"location":"3.ngql-guide/3.data-types/1.numeric/#reading_and_writing_of_data_values","title":"Reading and writing of data values","text":"When writing and reading different types of data, nGQL complies with the following rules:
Data type Set as VID Set as property Resulted data type INT64 Supported Supported INT64 INT32 Not supported Supported INT64 INT16 Not supported Supported INT64 INT8 Not supported Supported INT64 FLOAT Not supported Supported DOUBLE DOUBLE Not supported Supported DOUBLEFor example, nGQL does not support setting VID as INT8, but supports setting a certain property type of TAG or Edge type as INT8. When using the nGQL statement to read the property of INT8, the resulted type is INT64.
Multiple formats are supported:
123456
.0x1e240
.0361100
.However, NebulaGraph will parse the written non-decimal value into a decimal value and save it. The value read is decimal.
For example, the type of the property score
is INT
. The value of 0xb
is assigned to it through the INSERT statement. If querying the property value with statements such as FETCH, you will get the result 11
, which is the decimal result of the hexadecimal 0xb
.
Geography is a data type composed of latitude and longitude that represents geospatial information. NebulaGraph currently supports Point, LineString, and Polygon in Simple Features and some functions in SQL-MM 3, such as part of the core geo parsing, construction, formatting, conversion, predicates, and dimensions.
"},{"location":"3.ngql-guide/3.data-types/10.geography/#type_description","title":"Type description","text":"A point is the basic data type of geography, which is determined by a latitude and a longitude. For example, \"POINT(3 8)\"
means that the longitude is 3\u00b0
and the latitude is 8\u00b0
. Multiple points can form a linestring or a polygon.
Note
You cannot directly insert geographic data of the following types, such as INSERT VERTEX any_shape(geo) VALUES \"1\":(\"POINT(1 1)\")
. Instead, you need to use a geography function to specify the data type before inserting, such as INSERT VERTEX any_shape(geo) VALUES \"1\":(ST_GeogFromText(\"POINT(1 1)\"));
.
\"POINT(3 8)\"
Specifies the data type as a point. LineString \"LINESTRING(3 8, 4.7 73.23)\"
Specifies the data type as a linestring. Polygon \"POLYGON((0 1, 1 2, 2 3, 0 1))\"
Specifies the data type as a polygon."},{"location":"3.ngql-guide/3.data-types/10.geography/#examples","title":"Examples","text":"//Create a Tag to allow storing any geography data type.\nnebula> CREATE TAG IF NOT EXISTS any_shape(geo geography);\n\n//Create a Tag to allow storing a point only.\nnebula> CREATE TAG IF NOT EXISTS only_point(geo geography(point));\n\n//Create a Tag to allow storing a linestring only.\nnebula> CREATE TAG IF NOT EXISTS only_linestring(geo geography(linestring));\n\n//Create a Tag to allow storing a polygon only.\nnebula> CREATE TAG IF NOT EXISTS only_polygon(geo geography(polygon));\n\n//Create an Edge type to allow storing any geography data type.\nnebula> CREATE EDGE IF NOT EXISTS any_shape_edge(geo geography);\n\n//Create a vertex to store the geography of a polygon.\nnebula> INSERT VERTEX any_shape(geo) VALUES \"103\":(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\"));\n\n//Create an edge to store the geography of a polygon.\nnebula> INSERT EDGE any_shape_edge(geo) VALUES \"201\"->\"302\":(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\"));\n\n//Query the geography of Vertex 103.\nnebula> FETCH PROP ON any_shape \"103\" YIELD ST_ASText(any_shape.geo);\n+---------------------------------+\n| ST_ASText(any_shape.geo) |\n+---------------------------------+\n| \"POLYGON((0 1, 1 2, 2 3, 0 1))\" |\n+---------------------------------+\n\n//Query the geography of the edge which traverses from Vertex 201 to Vertex 302.\nnebula> FETCH PROP ON any_shape_edge \"201\"->\"302\" YIELD ST_ASText(any_shape_edge.geo);\n+---------------------------------+\n| ST_ASText(any_shape_edge.geo) |\n+---------------------------------+\n| \"POLYGON((0 1, 1 2, 2 3, 0 1))\" |\n+---------------------------------+\n\n//Create an index for the geography of the Tag any_shape and run LOOKUP.\nnebula> CREATE TAG INDEX IF NOT EXISTS any_shape_geo_index ON any_shape(geo);\nnebula> REBUILD TAG INDEX any_shape_geo_index;\nnebula> LOOKUP ON any_shape YIELD ST_ASText(any_shape.geo);\n+---------------------------------+\n| ST_ASText(any_shape.geo) |\n+---------------------------------+\n| \"POLYGON((0 1, 1 2, 2 3, 0 1))\" |\n+---------------------------------+\n
When creating an index for geography properties, you can specify the parameters for the index.
Parameter Default value Descriptions2_max_level
30
The maximum level of S2 cell used in the covering. Allowed values: 1
~30
. Setting it to less than the default means that NebulaGraph will be forced to generate coverings using larger cells. s2_max_cells
8
The maximum number of S2 cells used in the covering. Provides a limit on how much work is done exploring the possible coverings. Allowed values: 1
~30
. You may want to use higher values for odd-shaped regions such as skinny rectangles. Note
Specifying the above two parameters does not affect the Point type of property. The s2_max_level
value of the Point type is forced to be 30
.
nebula> CREATE TAG INDEX IF NOT EXISTS any_shape_geo_index ON any_shape(geo) with (s2_max_level=30, s2_max_cells=8);\n
For more index information, see Index overview.
"},{"location":"3.ngql-guide/3.data-types/2.boolean/","title":"Boolean","text":"A boolean data type is declared with the bool
keyword and can only take the values true
or false
.
nGQL supports using boolean in the following ways:
WHERE
clause.Fixed-length strings and variable-length strings are supported.
"},{"location":"3.ngql-guide/3.data-types/3.string/#declaration_and_literal_representation","title":"Declaration and literal representation","text":"The string type is declared with the keywords of:
STRING
: Variable-length strings.FIXED_STRING(<length>)
: Fixed-length strings. <length>
is the length of the string, such as FIXED_STRING(32)
.A string type is used to store a sequence of characters (text). The literal constant is a sequence of characters of any length surrounded by double or single quotes. For example, \"Hello, Cooper\"
or 'Hello, Cooper'
.
Nebula\u00a0Graph supports using string types in the following ways:
For example:
nebula> CREATE TAG IF NOT EXISTS t1 (p1 FIXED_STRING(10)); \n
nebula> CREATE TAG IF NOT EXISTS t2 (p2 STRING); \n
When the fixed-length string you try to write exceeds the length limit:
In strings, the backslash (\\
) serves as an escape character used to denote special characters.
For example, to include a double quote (\"
) within a string, you cannot directly write \"Hello \"world\"\"
as it leads to a syntax error. Instead, use the backslash (\\
) to escape the double quote, such as \"Hello \\\"world\\\"\"
.
nebula> RETURN \"Hello \\\"world\\\"\"\n+-----------------+\n| \"Hello \"world\"\" |\n+-----------------+\n| \"Hello \"world\"\" |\n+-----------------+\n
The backslash itself needs to be escaped as it's a special character. For example, to include a backslash in a string, you need to write \"Hello \\\\ world\"
.
nebula> RETURN \"Hello \\\\ world\"\n+-----------------+\n| \"Hello \\ world\" |\n+-----------------+\n| \"Hello \\ world\" |\n+-----------------+\n
For more examples of escape characters, see Escape character examples.
"},{"location":"3.ngql-guide/3.data-types/3.string/#opencypher_compatibility","title":"OpenCypher compatibility","text":"There are some tiny differences between openCypher and Cypher, as well as nGQL. The following is what openCypher requires. Single quotes cannot be converted to double quotes.
# File: Literals.feature\nFeature: Literals\n\nBackground:\n Given any graph\n Scenario: Return a single-quoted string\n When executing query:\n \"\"\"\n RETURN '' AS literal\n \"\"\"\n Then the result should be, in any order:\n | literal |\n | '' | # Note: it should return single-quotes as openCypher required.\n And no side effects\n
While Cypher accepts both single quotes and double quotes as the return results. nGQL follows the Cypher way.
nebula > YIELD '' AS quote1, \"\" AS quote2, \"'\" AS quote3, '\"' AS quote4\n+--------+--------+--------+--------+\n| quote1 | quote2 | quote3 | quote4 |\n+--------+--------+--------+--------+\n| \"\" | \"\" | \"'\" | \"\"\" |\n+--------+--------+--------+--------+\n
"},{"location":"3.ngql-guide/3.data-types/4.date-and-time/","title":"Date and time types","text":"This topic will describe the DATE
, TIME
, DATETIME
, TIMESTAMP
, and DURATION
types.
While inserting time-type property values with DATE
, TIME
, and DATETIME
, NebulaGraph transforms them to a UTC time according to the timezone specified with the timezone_name
parameter in the configuration files.
Note
To change the timezone, modify the timezone_name
value in the configuration files of all NebulaGraph services.
date()
, time()
, and datetime()
can convert a time-type property with a specified timezone. For example, datetime(\"2017-03-04 22:30:40.003000+08:00\")
or datetime(\"2017-03-04T22:30:40.003000[Asia/Shanghai]\")
.date()
, time()
, datetime()
, and timestamp()
all accept empty parameters to return the current date, time, and datetime.date()
, time()
, and datetime()
all accept the property name to return a specific property value of itself. For example, date().month
returns the current month, while time(\"02:59:40\").minute
returns the minutes of the importing time. You can use duration()
to calculate the offset of the moment. Addition and subtraction of date()
and date()
, timestamp()
and timestamp()
are also supported.In nGQL:
localdatetime()
is not supported. Supported time string formats are YYYY-MM-DDThh:mm:ss
and YYYY-MM-DD hh:mm:ss
. Single-digit hour, minute, and second values are supported, such as time(\"1:1:1\")
.The DATE
type is used for values with a date part but no time part. Nebula\u00a0Graph retrieves and displays DATE
values in the YYYY-MM-DD
format. The supported range is -32768-01-01
to 32767-12-31
.
The properties of date()
include year
, month
, and day
. date()
supports the input of YYYYY
, YYYYY-MM
or YYYYY-MM-DD
, and defaults to 01
for an untyped month or day.
nebula> RETURN DATE({year:-123, month:12, day:3});\n+------------------------------------+\n| date({year:-(123),month:12,day:3}) |\n+------------------------------------+\n| -123-12-03 |\n+------------------------------------+\n\nnebula> RETURN DATE(\"23333\");\n+---------------+\n| date(\"23333\") |\n+---------------+\n| 23333-01-01 |\n+---------------+\n\nnebula> RETURN DATE(\"2023-12-12\") - DATE(\"2023-12-11\");\n+-----------------------------------------+\n| (date(\"2023-12-12\")-date(\"2023-12-11\")) |\n+-----------------------------------------+\n| 1 |\n+-----------------------------------------+\n
"},{"location":"3.ngql-guide/3.data-types/4.date-and-time/#time","title":"TIME","text":"The TIME
type is used for values with a time part but no date part. Nebula\u00a0Graph retrieves and displays TIME
values in hh:mm:ss.msmsmsususus
format. The supported range is 00:00:00.000000
to 23:59:59.999999
.
The properties of time()
include hour
, minute
, and second
.
The DATETIME
type is used for values that contain both date and time parts. Nebula\u00a0Graph retrieves and displays DATETIME
values in YYYY-MM-DDThh:mm:ss.msmsmsususus
format. The supported range is -32768-01-01T00:00:00.000000
to 32767-12-31T23:59:59.999999
.
datetime()
include year
, month
, day
, hour
, minute
, and second
.datetime()
can convert TIMESTAMP
to DATETIME
. The value range of TIMESTAMP
is 0~9223372036
.datetime()
supports an int
argument. The int
argument specifies a timestamp.# To get the current date and time.\nnebula> RETURN datetime();\n+----------------------------+\n| datetime() |\n+----------------------------+\n| 2022-08-29T06:37:08.933000 |\n+----------------------------+\n\n# To get the current hour.\nnebula> RETURN datetime().hour;\n+-----------------+\n| datetime().hour |\n+-----------------+\n| 6 |\n+-----------------+\n\n# To get date time from a given timestamp.\nnebula> RETURN datetime(timestamp(1625469277));\n+---------------------------------+\n| datetime(timestamp(1625469277)) |\n+---------------------------------+\n| 2021-07-05T07:14:37.000000 |\n+---------------------------------+\n\nnebula> RETURN datetime(1625469277);\n+----------------------------+\n| datetime(1625469277) |\n+----------------------------+\n| 2021-07-05T07:14:37.000000 |\n+----------------------------+\n
"},{"location":"3.ngql-guide/3.data-types/4.date-and-time/#timestamp","title":"TIMESTAMP","text":"The TIMESTAMP
data type is used for values that contain both date and time parts. It has a range of 1970-01-01T00:00:01
UTC to 2262-04-11T23:47:16
UTC.
TIMESTAMP
has the following features:
1615974839
, which means 2021-03-17T17:53:59
.TIMESTAMP
querying methods: timestamp and timestamp()
function.TIMESTAMP
inserting methods: timestamp, timestamp()
function, and now()
function.timestamp()
function accepts empty arguments to get the current timestamp. It can pass an integer arguments to identify the integer as a timestamp and the range of passed integer is: 0~9223372036
\u3002timestamp()
function can convert DATETIME
to TIMESTAMP
, and the data type of DATETIME
should be a string
. # To get the current timestamp.\nnebula> RETURN timestamp();\n+-------------+\n| timestamp() |\n+-------------+\n| 1625469277 |\n+-------------+\n\n# To get a timestamp from given date and time.\nnebula> RETURN timestamp(\"2022-01-05T06:18:43\");\n+----------------------------------+\n| timestamp(\"2022-01-05T06:18:43\") |\n+----------------------------------+\n| 1641363523 |\n+----------------------------------+\n\n# To get a timestamp using datetime().\nnebula> RETURN timestamp(datetime(\"2022-08-29T07:53:10.939000\"));\n+---------------------------------------------------+\n| timestamp(datetime(\"2022-08-29T07:53:10.939000\")) |\n+---------------------------------------------------+\n| 1661759590 |\n+---------------------------------------------------+ \n
Note
The date and time format string passed into timestamp()
cannot include any millisecond and microsecond, but the date and time format string passed into timestamp(datetime())
can include a millisecond and a microsecond.
The DURATION
data type is used to indicate a period of time. Map data that are freely combined by years
, months
, days
, hours
, minutes
, and seconds
indicates the DURATION
.
DURATION
has the following features:
DURATION
is not supported.DURATION
can be used to calculate the specified time.Create a tag named date1
with three properties: DATE
, TIME
, and DATETIME
.
nebula> CREATE TAG IF NOT EXISTS date1(p1 date, p2 time, p3 datetime);\n
Insert a vertex named test1
.
nebula> INSERT VERTEX date1(p1, p2, p3) VALUES \"test1\":(date(\"2021-03-17\"), time(\"17:53:59\"), datetime(\"2017-03-04T22:30:40.003000[Asia/Shanghai]\"));\n
Query whether the value of property p1
on the test1
tag is 2021-03-17
.
nebula> MATCH (v:date1) RETURN v.date1.p1 == date(\"2021-03-17\");\n+----------------------------------+\n| (v.date1.p1==date(\"2021-03-17\")) |\n+----------------------------------+\n| true |\n+----------------------------------+\n
Return the content of the property p1
on test1
.
nebula> CREATE TAG INDEX IF NOT EXISTS date1_index ON date1(p1);\nnebula> REBUILD TAG INDEX date1_index;\nnebula> MATCH (v:date1) RETURN v.date1.p1;\n+------------------+\n| v.date1.p1.month |\n+------------------+\n| 3 |\n+------------------+\n
Search for vertices with p3
property values less than 2023-01-01T00:00:00.000000
, and return the p3
values.
nebula> MATCH (v:date1) \\\nWHERE v.date1.p3 < datetime(\"2023-01-01T00:00:00.000000\") \\\nRETURN v.date1.p3;\n+----------------------------+\n| v.date1.p3 |\n+----------------------------+\n| 2017-03-04T14:30:40.003000 |\n+----------------------------+\n
Create a tag named school
with the property of TIMESTAMP
.
nebula> CREATE TAG IF NOT EXISTS school(name string , found_time timestamp);\n
Insert a vertex named DUT
with a found-time timestamp of \"1988-03-01T08:00:00\"
.
# Insert as a timestamp. The corresponding timestamp of 1988-03-01T08:00:00 is 573177600, or 573206400 UTC.\nnebula> INSERT VERTEX school(name, found_time) VALUES \"DUT\":(\"DUT\", 573206400);\n\n# Insert in the form of date and time.\nnebula> INSERT VERTEX school(name, found_time) VALUES \"DUT\":(\"DUT\", timestamp(\"1988-03-01T08:00:00\"));\n
Insert a vertex named dut
and store time with now()
or timestamp()
functions.
# Use now() function to store time\nnebula> INSERT VERTEX school(name, found_time) VALUES \"dut\":(\"dut\", now());\n\n# Use timestamp() function to store time\nnebula> INSERT VERTEX school(name, found_time) VALUES \"dut\":(\"dut\", timestamp());\n
You can also use WITH
statement to set a specific date and time, or to perform calculations. For example:
nebula> WITH time({hour: 12, minute: 31, second: 14, millisecond:111, microsecond: 222}) AS d RETURN d;\n+-----------------+\n| d |\n+-----------------+\n| 12:31:14.111222 |\n+-----------------+\n\nnebula> WITH date({year: 1984, month: 10, day: 11}) AS x RETURN x + 1;\n+------------+\n| (x+1) |\n+------------+\n| 1984-10-12 |\n+------------+\n\nnebula> WITH date('1984-10-11') as x, duration({years: 12, days: 14, hours: 99, minutes: 12}) as d \\\n RETURN x + d AS sum, x - d AS diff;\n+------------+------------+\n| sum | diff |\n+------------+------------+\n| 1996-10-29 | 1972-09-23 |\n+------------+------------+\n
"},{"location":"3.ngql-guide/3.data-types/5.null/","title":"NULL","text":"You can set the properties for vertices or edges to NULL
. Also, you can set the NOT NULL
constraint to make sure that the property values are NOT NULL
. If not specified, the property is set to NULL
by default.
Here is the truth table for AND
, OR
, XOR
, and NOT
.
The comparisons and operations about NULL are different from openCypher. There may be changes later.
"},{"location":"3.ngql-guide/3.data-types/5.null/#comparisons_with_null","title":"Comparisons with NULL","text":"The comparison operations with NULL are incompatible with openCypher.
"},{"location":"3.ngql-guide/3.data-types/5.null/#operations_and_return_with_null","title":"Operations and RETURN with NULL","text":"The NULL operations and RETURN with NULL are incompatible with openCypher.
"},{"location":"3.ngql-guide/3.data-types/5.null/#examples","title":"Examples","text":""},{"location":"3.ngql-guide/3.data-types/5.null/#use_not_null","title":"Use NOT NULL","text":"Create a tag named player
. Specify the property name
as NOT NULL
.
nebula> CREATE TAG IF NOT EXISTS player(name string NOT NULL, age int);\n
Use SHOW
to create tag statements. The property name
is NOT NULL
. The property age
is NULL
by default.
nebula> SHOW CREATE TAG player;\n+-----------+-----------------------------------+\n| Tag | Create Tag |\n+-----------+-----------------------------------+\n| \"student\" | \"CREATE TAG `player` ( |\n| | `name` string NOT NULL, |\n| | `age` int64 NULL |\n| | ) ttl_duration = 0, ttl_col = \"\"\" |\n+-----------+-----------------------------------+\n
Insert the vertex Kobe
. The property age
can be NULL
.
nebula> INSERT VERTEX player(name, age) VALUES \"Kobe\":(\"Kobe\",null);\n
"},{"location":"3.ngql-guide/3.data-types/5.null/#use_not_null_and_set_the_default","title":"Use NOT NULL and set the default","text":"Create a tag named player
. Specify the property age
as NOT NULL
. The default value is 18
.
nebula> CREATE TAG IF NOT EXISTS player(name string, age int NOT NULL DEFAULT 18);\n
Insert the vertex Kobe
. Specify the property name
only.
nebula> INSERT VERTEX player(name) VALUES \"Kobe\":(\"Kobe\");\n
Query the vertex Kobe
. The property age
is 18
by default.
nebula> FETCH PROP ON player \"Kobe\" YIELD properties(vertex);\n+--------------------------+\n| properties(VERTEX) |\n+--------------------------+\n| {age: 18, name: \"Kobe\"} |\n+--------------------------+\n
"},{"location":"3.ngql-guide/3.data-types/6.list/","title":"Lists","text":"The list is a composite data type. A list is a sequence of values. Individual elements in a list can be accessed by their positions.
A list starts with a left square bracket [
and ends with a right square bracket ]
. A list contains zero, one, or more expressions. List elements are separated from each other with commas (,
). Whitespace around elements is ignored in the list, thus line breaks, tab stops, and blanks can be used for formatting.
A composite data type (i.e. set, map, and list) CANNOT be stored as properties of vertices or edges.
"},{"location":"3.ngql-guide/3.data-types/6.list/#list_operations","title":"List operations","text":"You can use the preset list function to operate the list, or use the index to filter the elements in the list.
"},{"location":"3.ngql-guide/3.data-types/6.list/#index_syntax","title":"Index syntax","text":"[M]\n[M..N]\n[M..]\n[..N]\n
The index of nGQL supports queries from front to back, starting from 0. 0 means the first element, 1 means the second element, and so on. It also supports queries from back to front, starting from -1. -1 means the last element, -2 means the penultimate element, and so on.
greater or equal to M but smaller than N
. Return empty when N
is 0.greater or equal to M
.smaller than N
. Return empty when N
is 0.Note
M
\u2265N
.M
is null, return BAD_TYPE
. When conducting a range query, if M
or N
is null, return null
.# The following query returns the list [1,2,3].\nnebula> RETURN list[1, 2, 3] AS a;\n+-----------+\n| a |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query returns the element whose index is 3 in the list [1,2,3,4,5]. In a list, the index starts from 0, and thus the return element is 4.\nnebula> RETURN range(1,5)[3];\n+---------------+\n| range(1,5)[3] |\n+---------------+\n| 4 |\n+---------------+\n\n# The following query returns the element whose index is -2 in the list [1,2,3,4,5]. The index of the last element in a list is -1, and thus the return element is 4.\nnebula> RETURN range(1,5)[-2];\n+------------------+\n| range(1,5)[-(2)] |\n+------------------+\n| 4 |\n+------------------+\n\n# The following query returns the elements whose indexes are from 0 to 3 (not including 3) in the list [1,2,3,4,5].\nnebula> RETURN range(1,5)[0..3];\n+------------------+\n| range(1,5)[0..3] |\n+------------------+\n| [1, 2, 3] |\n+------------------+\n\n# The following query returns the elements whose indexes are greater than 2 in the list [1,2,3,4,5].\nnebula> RETURN range(1,5)[3..] AS a;\n+--------+\n| a |\n+--------+\n| [4, 5] |\n+--------+\n\n# The following query returns the elements whose indexes are smaller than 3.\nnebula> WITH list[1, 2, 3, 4, 5] AS a \\\n RETURN a[..3] AS r;\n+-----------+\n| r |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query filters the elements whose indexes are greater than 2 in the list [1,2,3,4,5], calculate them respectively, and returns them.\nnebula> RETURN [n IN range(1,5) WHERE n > 2 | n + 10] AS a;\n+--------------+\n| a |\n+--------------+\n| [13, 14, 15] |\n+--------------+\n\n# The following query returns the elements from the first to the penultimate (inclusive) in the list [1, 2, 3].\nnebula> YIELD list[1, 2, 3][0..-1] AS a;\n+--------+\n| a |\n+--------+\n| [1, 2] |\n+--------+\n\n# The following query returns the elements from the first (exclusive) to the third backward in the list [1, 2, 3, 4, 5].\nnebula> YIELD list[1, 2, 3, 4, 5][-3..-1] AS a;\n+--------+\n| a |\n+--------+\n| [3, 4] |\n+--------+\n\n# The following query sets the variables and returns the elements whose indexes are 1 and 2.\nnebula> $var = YIELD 1 AS f, 3 AS t; \\\n YIELD list[1, 2, 3][$var.f..$var.t] AS a;\n+--------+\n| a |\n+--------+\n| [2, 3] |\n+--------+\n\n# The following query returns empty because the index is out of bound. 
It will return normally when the index is within the bound.\nnebula> RETURN list[1, 2, 3, 4, 5] [0..10] AS a;\n+-----------------+\n| a |\n+-----------------+\n| [1, 2, 3, 4, 5] |\n+-----------------+\n\nnebula> RETURN list[1, 2, 3] [-5..5] AS a;\n+-----------+\n| a |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query returns empty because there is a [0..0].\nnebula> RETURN list[1, 2, 3, 4, 5] [0..0] AS a;\n+----+\n| a |\n+----+\n| [] |\n+----+\n\n# The following query returns empty because of M \u2265 N.\nnebula> RETURN list[1, 2, 3, 4, 5] [3..1] AS a;\n+----+\n| a |\n+----+\n| [] |\n+----+\n\n# When conduct a range query, if `M` or `N` is null, return `null`.\nnebula> WITH list[1,2,3] AS a \\\n RETURN a[0..null] as r;\n+----------+\n| r |\n+----------+\n| __NULL__ |\n+----------+\n\n# The following query calculates the elements in the list [1,2,3,4,5] respectively and returns them without the list head.\nnebula> RETURN tail([n IN range(1, 5) | 2 * n - 10]) AS a;\n+-----------------+\n| a |\n+-----------------+\n| [-6, -4, -2, 0] |\n+-----------------+\n\n# The following query takes the elements in the list [1,2,3] as true and return.\nnebula> RETURN [n IN range(1, 3) WHERE true | n] AS r;\n+-----------+\n| r |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query returns the length of the list [1,2,3].\nnebula> RETURN size(list[1,2,3]);\n+-------------------+\n| size(list[1,2,3]) |\n+-------------------+\n| 3 |\n+-------------------+\n\n# The following query calculates the elements in the list [92,90] and runs a conditional judgment in a where clause.\nnebula> GO FROM \"player100\" OVER follow WHERE properties(edge).degree NOT IN [x IN [92, 90] | x + $$.player.age] \\\n YIELD dst(edge) AS id, properties(edge).degree AS degree;\n+-------------+--------+\n| id | degree |\n+-------------+--------+\n| \"player101\" | 95 |\n| \"player102\" | 90 |\n+-------------+--------+\n\n# The following query takes the query result of the MATCH statement as the elements in a list. Then it calculates and returns them.\nnebula> MATCH p = (n:player{name:\"Tim Duncan\"})-[:follow]->(m) \\\n RETURN [n IN nodes(p) | n.player.age + 100] AS r;\n+------------+\n| r |\n+------------+\n| [142, 136] |\n| [142, 141] |\n+------------+\n
"},{"location":"3.ngql-guide/3.data-types/6.list/#opencypher_compatibility_1","title":"OpenCypher compatibility","text":"null
when querying a single out-of-bound element. However, in nGQL, return OUT_OF_RANGE
when querying a single out-of-bound element.nebula> RETURN range(0,5)[-12];\n+-------------------+\n| range(0,5)[-(12)] |\n+-------------------+\n| OUT_OF_RANGE |\n+-------------------+\n
A composite data type (i.e., set, map, and list) CAN NOT be stored as properties for vertices or edges.
It is recommended to modify the graph modeling method. The composite data type should be modeled as an adjacent edge of a vertex, rather than its property. Each adjacent edge can be dynamically added or deleted. The rank values of the adjacent edges can be used for sequencing.
[(src)-[]->(m) | m.name]
.The set is a composite data type. A set is a collection of values. Unlike a list, the values in a set are unordered and each value must be unique.
A set starts with a left curly bracket {
and ends with a right curly bracket }
. A set contains zero, one, or more expressions. Set elements are separated from each other with commas (,
). Whitespace around elements is ignored in the set, thus line breaks, tab stops, and blanks can be used for formatting.
# The following query returns the set {1,2,3}.\nnebula> RETURN set{1, 2, 3} AS a;\n+-----------+\n| a |\n+-----------+\n| {3, 2, 1} |\n+-----------+\n\n# The following query returns the set {1,2} because a set does not allow duplicate elements and is unordered.\nnebula> RETURN set{1, 2, 1} AS a;\n+--------+\n| a |\n+--------+\n| {2, 1} |\n+--------+\n\n# The following query checks whether the set has the specified element 1.\nnebula> RETURN 1 IN set{1, 2} AS a;\n+------+\n| a |\n+------+\n| true |\n+------+\n\n# The following query counts the number of elements in the set.\nnebula> YIELD size(set{1, 2, 1}) AS a;\n+---+\n| a |\n+---+\n| 2 |\n+---+\n\n# The following query returns a set of target vertex property values.\nnebula> GO FROM \"player100\" OVER follow \\\n    YIELD set{properties($$).name,properties($$).age} as a;\n+-----------------------+\n| a |\n+-----------------------+\n| {36, \"Tony Parker\"} |\n| {41, \"Manu Ginobili\"} |\n+-----------------------+\n
"},{"location":"3.ngql-guide/3.data-types/8.map/","title":"Maps","text":"The map is a composite data type. Maps are unordered collections of key-value pairs. In maps, the key is a string. The value can have any data type. You can get the map element by using map['key']
.
A map starts with a left curly bracket {
and ends with a right curly bracket }
. A map contains zero, one, or more key-value pairs. Map elements are separated from each other with commas (,
). Whitespace around elements is ignored in the map, thus line breaks, tab stops, and blanks can be used for formatting.
# The following query returns the simple map.\nnebula> YIELD map{key1: 'Value1', Key2: 'Value2'} as a;\n+----------------------------------+\n| a |\n+----------------------------------+\n| {Key2: \"Value2\", key1: \"Value1\"} |\n+----------------------------------+\n\n# The following query returns the list type map.\nnebula> YIELD map{listKey: [{inner: 'Map1'}, {inner: 'Map2'}]} as a;\n+-----------------------------------------------+\n| a |\n+-----------------------------------------------+\n| {listKey: [{inner: \"Map1\"}, {inner: \"Map2\"}]} |\n+-----------------------------------------------+\n\n# The following query returns the hybrid type map.\nnebula> RETURN map{a: LIST[1,2], b: SET{1,2,1}, c: \"hee\"} as a;\n+----------------------------------+\n| a |\n+----------------------------------+\n| {a: [1, 2], b: {2, 1}, c: \"hee\"} |\n+----------------------------------+\n\n# The following query returns the specified element in a map.\nnebula> RETURN map{a: LIST[1,2], b: SET{1,2,1}, c: \"hee\"}[\"b\"] AS b;\n+--------+\n| b |\n+--------+\n| {2, 1} |\n+--------+\n\n# The following query checks whether the map has the specified key. Checking whether the map has a specified value is not supported yet.\nnebula> RETURN \"a\" IN MAP{a:1, b:2} AS a;\n+------+\n| a |\n+------+\n| true |\n+------+\n
"},{"location":"3.ngql-guide/3.data-types/9.type-conversion/","title":"Type Conversion/Type coercions","text":"Converting an expression of a given type to another type is known as type conversion.
NebulaGraph supports converting expressions explicitly to other types. For details, see Type conversion functions.
"},{"location":"3.ngql-guide/3.data-types/9.type-conversion/#examples","title":"Examples","text":"nebula> UNWIND [true, false, 'true', 'false', NULL] AS b \\\n RETURN toBoolean(b) AS b;\n+----------+\n| b |\n+----------+\n| true |\n| false |\n| true |\n| false |\n| __NULL__ |\n+----------+\n\nnebula> RETURN toFloat(1), toFloat('1.3'), toFloat('1e3'), toFloat('not a number');\n+------------+----------------+----------------+-------------------------+\n| toFloat(1) | toFloat(\"1.3\") | toFloat(\"1e3\") | toFloat(\"not a number\") |\n+------------+----------------+----------------+-------------------------+\n| 1.0 | 1.3 | 1000.0 | __NULL__ |\n+------------+----------------+----------------+-------------------------+\n
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/1.composite-queries/","title":"Composite queries (clause structure)","text":"Composite queries put data from different queries together. They then use filters, group-bys, or sorting before returning the combined return results.
NebulaGraph supports three methods to run composite queries (or sub-queries):
|
). The result of the previous query can be used as the input of the next query. In a composite query, do not mix openCypher and native nGQL clauses in one statement. For example, this statement is undefined: MATCH ... | GO ... | YIELD ...
.
MATCH
, RETURN
, WITH
, etc), do not use pipes or semicolons to combine the sub-clauses.FETCH
, GO
, LOOKUP
, etc), you must use pipes or semicolons to combine the sub-clauses.transactional
queries (as in SQL/Cypher)","text":"For example, a query is composed of three sub-queries: A B C
, A | B | C
or A; B; C
. Suppose that A is a read operation, B is a computation operation, and C is a write operation. If any part fails during execution, the whole result is undefined. There is no rollback. What has been written depends on the query executor.
Note
OpenCypher has no requirement for transaction
.
# Connect multiple queries with clauses.\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})--() \\\n WITH nodes(p) AS n \\\n UNWIND n AS n1 \\\n RETURN DISTINCT n1;\n
# Combine multiple queries with semicolons. Only the result of the last query, SHOW EDGES, is returned.\nnebula> SHOW TAGS; SHOW EDGES;\n\n# Insert multiple vertices.\nnebula> INSERT VERTEX player(name, age) VALUES \"player100\":(\"Tim Duncan\", 42); \\\n    INSERT VERTEX player(name, age) VALUES \"player101\":(\"Tony Parker\", 36); \\\n    INSERT VERTEX player(name, age) VALUES \"player102\":(\"LaMarcus Aldridge\", 33);\n
# Connect multiple queries with pipes.\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge) AS id | \\\n GO FROM $-.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------+-----------------+\n| Team | Player |\n+-----------+-----------------+\n| \"Spurs\" | \"Tony Parker\" |\n| \"Hornets\" | \"Tony Parker\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------+-----------------+\n
User-defined variables allow passing the result of one statement to another.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/2.user-defined-variables/#opencypher_compatibility","title":"OpenCypher compatibility","text":"In openCypher, when you refer to the vertex, edge, or path of a variable, you need to name it first. For example:
nebula> MATCH (v:player{name:\"Tim Duncan\"}) RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{name: \"Tim Duncan\", age: 42}) |\n+----------------------------------------------------+\n
The user-defined variable in the preceding query is v
.
Caution
In a pattern of a MATCH statement, you cannot use the same edge variable repeatedly. For example, e
cannot be written in the pattern p=(v1)-[e*2..2]->(v2)-[e*2..2]->(v3)
.
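A sketch of the compliant form, using two distinct (hypothetical) edge variables e1 and e2 instead:
nebula> MATCH p=(v1:player{name:"Tim Duncan"})-[e1*2..2]->(v2)-[e2*2..2]->(v3) \
        RETURN p;
# Each variable-length pattern gets its own edge variable, so the pattern is legal.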
User-defined variables are written as $var_name
. The var_name
consists of letters, numbers, or underscores. Any other characters are not permitted.
User-defined variables are valid only for the current execution (namely, within this composite query). When the execution ends, the user-defined variables automatically expire. The user-defined variables in one statement CANNOT be used in any other clients, executions, or sessions.
You can use user-defined variables in composite queries. For details about composite queries, see Composite queries.
Note
nebula> $var = GO FROM \"player100\" OVER follow YIELD dst(edge) AS id; \\\n GO FROM $var.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------+-----------------+\n| Team | Player |\n+-----------+-----------------+\n| \"Spurs\" | \"Tony Parker\" |\n| \"Hornets\" | \"Tony Parker\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------+-----------------+\n
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/2.user-defined-variables/#set_operations_and_scope_of_user-defined_variables","title":"Set operations and scope of user-defined variables","text":"When assigning variables within a compound statement involving set operations, it is important to enclose the scope of the variable assignment in parentheses. In the example below, the source of the $var
assignment is the result of combining the outputs of two GO statements with INTERSECT
.
$var = ( \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id \\\n INTERSECT \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id \\\n ); \\\n GO FROM $var.id OVER follow YIELD follow.degree AS degree\n
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/","title":"Reference to properties","text":"nGQL provides property references to allow you to refer to the properties of the source vertex, the destination vertex, and the edge in the GO
statement, and to refer to the output results of the statement in composite queries. This topic describes how to use these property references in nGQL.
Note
This function applies to native nGQL only.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#property_references_for_vertexes","title":"Property references for vertexes","text":"Parameter Description$^
Used to get the property of the source vertex. $$
Used to get the property of the destination vertex."},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#property_reference_syntax","title":"Property reference syntax","text":"$^.<tag_name>.<prop_name> # Source vertex property reference\n$$.<tag_name>.<prop_name> # Destination vertex property reference\n
tag_name
: The tag name of the vertex.prop_name
: The property name within the tag._src
The source vertex ID of the edge _dst
The destination vertex ID of the edge _type
The internal encoding of edge types that uses the sign to indicate direction. Positive numbers represent forward edges, while negative numbers represent backward edges. _rank
The rank value for the edge"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#property_reference_syntax_1","title":"Property reference syntax","text":"nGQL allows you to reference edge properties, including user-defined edge properties and four built-in edge properties.
<edge_type>.<prop_name> # User-defined edge property reference\n<edge_type>._src|_dst|_type|_rank # Built-in edge property reference\n
edge_type
: The edge type.prop_name
: The property name within the edge type.$-
Used to get the output results of the statement before the pipe in the composite query. For more information, see Pipe."},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#examples","title":"Examples","text":""},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#use_property_references_for_vertexes","title":"Use property references for vertexes","text":"The following query returns the name
property of the player
tag on the source vertex and the age
property of the player
tag on the destination vertex.
nebula> GO FROM \"player100\" OVER follow YIELD $^.player.name AS startName, $$.player.age AS endAge;\n+--------------+--------+\n| startName | endAge |\n+--------------+--------+\n| \"Tim Duncan\" | 36 |\n| \"Tim Duncan\" | 41 |\n+--------------+--------+\n
Legacy version compatibility
Starting from NebulaGraph 2.6.0, Schema-related functions are supported. The preceding example can be rewritten as follows in NebulaGraph master to produce the same results:
GO FROM \"player100\" OVER follow YIELD properties($^).name AS startName, properties($$).age AS endAge;\n
NebulaGraph master is compatible with both new and old syntax.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#use_property_references_for_edges","title":"Use property references for edges","text":"The following query returns the degree
property of the edge type follow
.
nebula> GO FROM \"player100\" OVER follow YIELD follow.degree;\n+---------------+\n| follow.degree |\n+---------------+\n| 95 |\n+---------------+\n
The following query returns the source vertex, the destination vertex, the edge type, and the edge rank value of the edge type follow
.
nebula> GO FROM \"player100\" OVER follow YIELD follow._src, follow._dst, follow._type, follow._rank;\n+-------------+-------------+--------------+--------------+\n| follow._src | follow._dst | follow._type | follow._rank |\n+-------------+-------------+--------------+--------------+\n| \"player100\" | \"player101\" | 17 | 0 |\n| \"player100\" | \"player125\" | 17 | 0 |\n+-------------+-------------+--------------+--------------+\n
Legacy version compatibility
Starting from NebulaGraph 2.6.0, Schema-related functions are supported. The preceding example can be rewritten as follows in NebulaGraph master to produce the same results:
GO FROM \"player100\" OVER follow YIELD properties(edge).degree;\nGO FROM \"player100\" OVER follow YIELD src(edge), dst(edge), type(edge), rank(edge);\n
NebulaGraph master is compatible with both new and old syntax.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#use_property_references_for_composite_queries","title":"Use property references for composite queries","text":"The following composite query performs the following actions:
$-.id
to get the results of the statement GO FROM \"player100\" OVER follow YIELD dst(edge) AS id
, which returns the destination vertex ID of the follow
edge type.properties($^)
function to get the name property of the player tag on the source vertex of the serve
edge type.properties($$)
function to get the name property of the team tag on the destination vertex of the serve
edge type.nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id | \\\n GO FROM $-.id OVER serve \\\n YIELD properties($^).name AS Player, properties($$).name AS Team;\n+-----------------+-----------+\n| Player | Team |\n+-----------------+-----------+\n| \"Tony Parker\" | \"Spurs\" |\n| \"Tony Parker\" | \"Hornets\" |\n| \"Manu Ginobili\" | \"Spurs\" |\n+-----------------+-----------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/","title":"Comparison operators","text":"NebulaGraph supports the following comparison operators.
Name Description==
Equal operator !=
, <>
Not equal operator >
Greater than operator >=
Greater than or equal operator <
Less than operator <=
Less than or equal operator IS NULL
NULL check IS NOT NULL
Not NULL check IS EMPTY
EMPTY check IS NOT EMPTY
Not EMPTY check The result of the comparison operation is true
or false
.
Note
Comparability between values of different types is often undefined. The result could be NULL
or others.EMPTY
is currently used only for checking, and does not support functions or operations such as GROUP BY
, count()
, sum()
, max()
, hash()
, collect()
, +
or *
.openCypher does not have EMPTY
. Thus EMPTY
is not supported in MATCH statements.
==
","text":"String comparisons are case-sensitive. Values of different types are not equal.
Note
The equal operator is ==
in nGQL, while in openCypher it is =
.
nebula> RETURN 'A' == 'a', toUpper('A') == toUpper('a'), toLower('A') == toLower('a');\n+------------+------------------------------+------------------------------+\n| (\"A\"==\"a\") | (toUpper(\"A\")==toUpper(\"a\")) | (toLower(\"A\")==toLower(\"a\")) |\n+------------+------------------------------+------------------------------+\n| false | true | true |\n+------------+------------------------------+------------------------------+\n\nnebula> RETURN '2' == 2, toInteger('2') == 2;\n+----------+---------------------+\n| (\"2\"==2) | (toInteger(\"2\")==2) |\n+----------+---------------------+\n| false | true |\n+----------+---------------------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_2","title":">
","text":"nebula> RETURN 3 > 2;\n+-------+\n| (3>2) |\n+-------+\n| true |\n+-------+\n\nnebula> WITH 4 AS one, 3 AS two \\\n RETURN one > two AS result;\n+--------+\n| result |\n+--------+\n| true |\n+--------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_3","title":">=
","text":"nebula> RETURN 2 >= \"2\", 2 >= 2;\n+----------+--------+\n| (2>=\"2\") | (2>=2) |\n+----------+--------+\n| __NULL__ | true |\n+----------+--------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_4","title":"<
","text":"nebula> YIELD 2.0 < 1.9;\n+---------+\n| (2<1.9) |\n+---------+\n| false |\n+---------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_5","title":"<=
","text":"nebula> YIELD 0.11 <= 0.11;\n+--------------+\n| (0.11<=0.11) |\n+--------------+\n| true |\n+--------------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_6","title":"!=
","text":"nebula> YIELD 1 != '1';\n+----------+\n| (1!=\"1\") |\n+----------+\n| true |\n+----------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#is_not_null","title":"IS [NOT] NULL
","text":"nebula> RETURN null IS NULL AS value1, null == null AS value2, null != null AS value3;\n+--------+----------+----------+\n| value1 | value2 | value3 |\n+--------+----------+----------+\n| true | __NULL__ | __NULL__ |\n+--------+----------+----------+\n\nnebula> RETURN length(NULL), size(NULL), count(NULL), NULL IS NULL, NULL IS NOT NULL, sin(NULL), NULL + NULL, [1, NULL] IS NULL;\n+--------------+------------+-------------+--------------+------------------+-----------+-------------+------------------+\n| length(NULL) | size(NULL) | count(NULL) | NULL IS NULL | NULL IS NOT NULL | sin(NULL) | (NULL+NULL) | [1,NULL] IS NULL |\n+--------------+------------+-------------+--------------+------------------+-----------+-------------+------------------+\n| __NULL__ | __NULL__ | 0 | true | false | __NULL__ | __NULL__ | false |\n+--------------+------------+-------------+--------------+------------------+-----------+-------------+------------------+\n\nnebula> WITH {name: null} AS `map` \\\n RETURN `map`.name IS NOT NULL;\n+----------------------+\n| map.name IS NOT NULL |\n+----------------------+\n| false |\n+----------------------+\n\nnebula> WITH {name: 'Mats', name2: 'Pontus'} AS map1, \\\n {name: null} AS map2, {notName: 0, notName2: null } AS map3 \\\n RETURN map1.name IS NULL, map2.name IS NOT NULL, map3.name IS NULL;\n+-------------------+-----------------------+-------------------+\n| map1.name IS NULL | map2.name IS NOT NULL | map3.name IS NULL |\n+-------------------+-----------------------+-------------------+\n| false | false | true |\n+-------------------+-----------------------+-------------------+\n\nnebula> MATCH (n:player) \\\n RETURN n.player.age IS NULL, n.player.name IS NOT NULL, n.player.empty IS NULL;\n+----------------------+---------------------------+------------------------+\n| n.player.age IS NULL | n.player.name IS NOT NULL | n.player.empty IS NULL |\n+----------------------+---------------------------+------------------------+\n| false | true | true |\n| false | true | true |\n...\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#is_not_empty","title":"IS [NOT] EMPTY
","text":"nebula> RETURN null IS EMPTY;\n+---------------+\n| NULL IS EMPTY |\n+---------------+\n| false |\n+---------------+\n\nnebula> RETURN \"a\" IS NOT EMPTY;\n+------------------+\n| \"a\" IS NOT EMPTY |\n+------------------+\n| true |\n+------------------+\n\nnebula> GO FROM \"player100\" OVER * WHERE properties($$).name IS NOT EMPTY YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"team204\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
"},{"location":"3.ngql-guide/5.operators/10.arithmetic/","title":"Arithmetic operators","text":"NebulaGraph supports the following arithmetic operators.
Name Description+
Addition operator -
Minus operator *
Multiplication operator /
Division operator %
Modulo operator -
Changes the sign of the argument"},{"location":"3.ngql-guide/5.operators/10.arithmetic/#examples","title":"Examples","text":"nebula> RETURN 1+2 AS result;\n+--------+\n| result |\n+--------+\n| 3 |\n+--------+\n\nnebula> RETURN -10+5 AS result;\n+--------+\n| result |\n+--------+\n| -5 |\n+--------+\n\nnebula> RETURN (3*8)%5 AS result;\n+--------+\n| result |\n+--------+\n| 4 |\n+--------+\n
"},{"location":"3.ngql-guide/5.operators/2.boolean/","title":"Boolean operators","text":"NebulaGraph supports the following boolean operators.
Name Description AND Logical AND NOT Logical NOT OR Logical OR XOR Logical XORFor the precedence of the operators, refer to Operator Precedence.
For the logical operations with NULL
, refer to NULL.
Multiple queries can be combined using pipe operators in nGQL.
"},{"location":"3.ngql-guide/5.operators/4.pipe/#opencypher_compatibility","title":"OpenCypher compatibility","text":"Pipe operators apply to native nGQL only.
"},{"location":"3.ngql-guide/5.operators/4.pipe/#syntax","title":"Syntax","text":"One major difference between nGQL and SQL is how sub-queries are composed.
PIPE (|)
is introduced into the sub-queries.nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS dstid, properties($$).name AS Name | \\\n GO FROM $-.dstid OVER follow YIELD dst(edge);\n\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n| \"player100\" |\n+-------------+\n
Users must define aliases in the YIELD
clause for the reference operator $-
to use, just like $-.dstid
in the preceding example.
In NebulaGraph, pipes will affect the performance. Take A | B
as an example, the effects are as follows:
Pipe operators operate synchronously. That is, the data can enter the pipe clause as a whole after the execution of clause A
before the pipe operator is completed.
If A
sends a large amount of data to |
, the entire query request may be very slow. You can try to split this statement.
Send A
from the application,
Split the return results on the application,
Send to multiple graphd processes concurrently,
Every graphd process executes part of B.
This is usually much faster than executing a complete A | B
with a single graphd process.
This topic will describe the set operators, including UNION
, UNION ALL
, INTERSECT
, and MINUS
. To combine multiple queries, use these set operators.
All set operators have equal precedence. If a nGQL statement contains multiple set operators, NebulaGraph will evaluate them from left to right unless parentheses explicitly specify another order.
Caution
The names and order of the variables defined in the query statements before and after the set operator must be consistent. For example, the names and order of a,b,c
in RETURN a,b,c UNION RETURN a,b,c
need to be consistent.
<left> UNION [DISTINCT | ALL] <right> [ UNION [DISTINCT | ALL] <right> ...]\n
UNION DISTINCT
(or by short UNION
) returns the union of two sets A and B without duplicated elements.UNION ALL
returns the union of two sets A and B with duplicated elements.<left>
and <right>
must have the same number of columns and data types. Different data types are converted according to the Type Conversion.# The following statement returns the union of two query results without duplicated elements.\nnebula> GO FROM \"player102\" OVER follow YIELD dst(edge) \\\n UNION \\\n GO FROM \"player100\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n\nnebula> MATCH (v:player) \\\n WITH v.player.name AS v \\\n RETURN n ORDER BY n LIMIT 3 \\\n UNION \\\n UNWIND [\"Tony Parker\", \"Ben Simmons\"] AS n \\\n RETURN n;\n+---------------------+\n| n |\n+---------------------+\n| \"Amar'e Stoudemire\" |\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Tony Parker\" |\n+---------------------+\n\n# The following statement returns the union of two query results with duplicated elements.\nnebula> GO FROM \"player102\" OVER follow YIELD dst(edge) \\\n UNION ALL \\\n GO FROM \"player100\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n\nnebula> MATCH (v:player) \\\n WITH v.player.name AS n \\\n RETURN n ORDER BY n LIMIT 3 \\\n UNION ALL \\\n UNWIND [\"Tony Parker\", \"Ben Simmons\"] AS n \\\n RETURN n;\n+---------------------+\n| n |\n+---------------------+\n| \"Amar'e Stoudemire\" |\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Tony Parker\" |\n| \"Ben Simmons\" |\n+---------------------+\n\n# UNION can also work with the YIELD statement. The DISTINCT keyword will check duplication by all the columns for every line, and remove duplicated lines if every column is the same.\nnebula> GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age \\\n UNION /* DISTINCT */ \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age;\n+-------------+--------+-----+\n| id | Degree | Age |\n+-------------+--------+-----+\n| \"player100\" | 75 | 42 |\n| \"player101\" | 75 | 36 |\n| \"player101\" | 95 | 36 |\n| \"player125\" | 95 | 41 |\n+-------------+--------+-----+\n
"},{"location":"3.ngql-guide/5.operators/6.set/#intersect","title":"INTERSECT","text":"<left> INTERSECT <right>\n
INTERSECT
returns the intersection of two sets A and B (denoted by A \u22c2 B).UNION
, the left
and right
must have the same number of columns and data types. Different data types are converted according to the Type Conversion.# The following statement returns the intersection of two query results.\nnebula> GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age \\\n INTERSECT \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age;\n+----+--------+-----+\n| id | Degree | Age |\n+----+--------+-----+\n+----+--------+-----+\n\nnebula> MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) == \"player102\" \\\n RETURN id(v2) As id, e.degree As Degree, v2.player.age AS Age \\\n INTERSECT \\\n MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) == \"player100\" \\\n RETURN id(v2) As id, e.degree As Degree, v2.player.age AS Age;\n+----+--------+-----+\n| id | Degree | Age |\n+----+--------+-----+\n+----+--------+-----+\n\nnebula> UNWIND [1,2] AS a RETURN a \\\n INTERSECT \\\n UNWIND [1,2,3,4] AS a \\\n RETURN a;\n+---+\n| a |\n+---+\n| 1 |\n| 2 |\n+---+\n
"},{"location":"3.ngql-guide/5.operators/6.set/#minus","title":"MINUS","text":"<left> MINUS <right>\n
Operator MINUS
returns the subtraction (or difference) of two sets A and B (denoted by A-B
). Always pay attention to the order of left
and right
. The set A-B
consists of elements that are in A but not in B.
# The following statement returns the elements in the first query result but not in the second query result.\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge) \\\n MINUS \\\n GO FROM \"player102\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player125\" |\n+-------------+\n\nnebula> GO FROM \"player102\" OVER follow YIELD dst(edge) AS id\\\n MINUS \\\n GO FROM \"player100\" OVER follow YIELD dst(edge) AS id;\n+-------------+\n| id |\n+-------------+\n| \"player100\" |\n+-------------+\n\nnebula> MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) ==\"player102\" \\\n RETURN id(v2) AS id\\\n MINUS \\\n MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) ==\"player100\" \\\n RETURN id(v2) AS id;\n+-------------+\n| id |\n+-------------+\n| \"player100\" |\n+-------------+\n\nnebula> UNWIND [1,2,3] AS a RETURN a \\\n MINUS \\\n WITH 4 AS a \\\n RETURN a;\n+---+\n| a |\n+---+\n| 1 |\n| 2 |\n| 3 |\n+---+\n
"},{"location":"3.ngql-guide/5.operators/6.set/#precedence_of_the_set_operators_and_pipe_operators","title":"Precedence of the set operators and pipe operators","text":"Please note that when a query contains a pipe |
and a set operator, the pipe takes precedence. Refer to Pipe for details. The query GO FROM 1 UNION GO FROM 2 | GO FROM 3
is the same as the query GO FROM 1 UNION (GO FROM 2 | GO FROM 3)
.
nebula> GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS play_dst \\\n UNION \\\n GO FROM \"team200\" OVER serve REVERSELY \\\n YIELD src(edge) AS play_src \\\n | GO FROM $-.play_src OVER follow YIELD dst(edge) AS play_dst;\n\n+-------------+\n| play_dst |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player117\" |\n| \"player105\" |\n+-------------+\n
The above query executes the statements in the red bar first and then executes the statement in the green box.
The parentheses can change the execution priority. For example:
nebula> (GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS play_dst \\\n UNION \\\n GO FROM \"team200\" OVER serve REVERSELY \\\n YIELD src(edge) AS play_dst) \\\n | GO FROM $-.play_dst OVER follow YIELD dst(edge) AS play_dst;\n
In the above query, the statements within the parentheses take precedence. That is, the UNION
operation will be executed first, and its output will be executed as the input of the next operation with pipes.
You can use the following string operators for concatenating, querying, and matching.
Name Description + Concatenates strings. CONTAINS Performs searchings in strings. (NOT) IN Checks whether a value is within a set of values. (NOT) STARTS WITH Performs matchings at the beginning of a string. (NOT) ENDS WITH Performs matchings at the end of a string. Regular expressions Perform string matchings using regular expressions.Note
All the string searchings or matchings are case-sensitive.
"},{"location":"3.ngql-guide/5.operators/7.string/#examples","title":"Examples","text":""},{"location":"3.ngql-guide/5.operators/7.string/#_1","title":"+
","text":"nebula> RETURN 'a' + 'b';\n+-----------+\n| (\"a\"+\"b\") |\n+-----------+\n| \"ab\" |\n+-----------+\nnebula> UNWIND 'a' AS a UNWIND 'b' AS b RETURN a + b;\n+-------+\n| (a+b) |\n+-------+\n| \"ab\" |\n+-------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#contains","title":"CONTAINS
","text":"The CONTAINS
operator requires string types on both left and right sides.
nebula> MATCH (s:player)-[e:serve]->(t:team) WHERE id(s) == \"player101\" \\\n AND t.team.name CONTAINS \"ets\" RETURN s.player.name, e.start_year, e.end_year, t.team.name;\n+---------------+--------------+------------+-------------+\n| s.player.name | e.start_year | e.end_year | t.team.name |\n+---------------+--------------+------------+-------------+\n| \"Tony Parker\" | 2018 | 2019 | \"Hornets\" |\n+---------------+--------------+------------+-------------+\n\nnebula> GO FROM \"player101\" OVER serve WHERE (STRING)properties(edge).start_year CONTAINS \"19\" AND \\\n properties($^).name CONTAINS \"ny\" \\\n YIELD properties($^).name, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+---------------------+-----------------------------+---------------------------+---------------------+\n| properties($^).name | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+---------------------+-----------------------------+---------------------------+---------------------+\n| \"Tony Parker\" | 1999 | 2018 | \"Spurs\" |\n+---------------------+-----------------------------+---------------------------+---------------------+\n\nnebula> GO FROM \"player101\" OVER serve WHERE !(properties($$).name CONTAINS \"ets\") \\\n YIELD properties($^).name, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+---------------------+-----------------------------+---------------------------+---------------------+\n| properties($^).name | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+---------------------+-----------------------------+---------------------------+---------------------+\n| \"Tony Parker\" | 1999 | 2018 | \"Spurs\" |\n+---------------------+-----------------------------+---------------------------+---------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#not_in","title":"(NOT) IN
","text":"nebula> RETURN 1 IN [1,2,3], \"Yao\" NOT IN [\"Yi\", \"Tim\", \"Kobe\"], NULL IN [\"Yi\", \"Tim\", \"Kobe\"];\n+----------------+------------------------------------+-------------------------------+\n| (1 IN [1,2,3]) | (\"Yao\" NOT IN [\"Yi\",\"Tim\",\"Kobe\"]) | (NULL IN [\"Yi\",\"Tim\",\"Kobe\"]) |\n+----------------+------------------------------------+-------------------------------+\n| true | true | __NULL__ |\n+----------------+------------------------------------+-------------------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#not_starts_with","title":"(NOT) STARTS WITH
","text":"nebula> RETURN 'apple' STARTS WITH 'app', 'apple' STARTS WITH 'a', 'apple' STARTS WITH toUpper('a');\n+-----------------------------+---------------------------+------------------------------------+\n| (\"apple\" STARTS WITH \"app\") | (\"apple\" STARTS WITH \"a\") | (\"apple\" STARTS WITH toUpper(\"a\")) |\n+-----------------------------+---------------------------+------------------------------------+\n| true | true | false |\n+-----------------------------+---------------------------+------------------------------------+\n\nnebula> RETURN 'apple' STARTS WITH 'b','apple' NOT STARTS WITH 'app';\n+---------------------------+---------------------------------+\n| (\"apple\" STARTS WITH \"b\") | (\"apple\" NOT STARTS WITH \"app\") |\n+---------------------------+---------------------------------+\n| false | false |\n+---------------------------+---------------------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#not_ends_with","title":"(NOT) ENDS WITH
","text":"nebula> RETURN 'apple' ENDS WITH 'app', 'apple' ENDS WITH 'e', 'apple' ENDS WITH 'E', 'apple' ENDS WITH 'b';\n+---------------------------+-------------------------+-------------------------+-------------------------+\n| (\"apple\" ENDS WITH \"app\") | (\"apple\" ENDS WITH \"e\") | (\"apple\" ENDS WITH \"E\") | (\"apple\" ENDS WITH \"b\") |\n+---------------------------+-------------------------+-------------------------+-------------------------+\n| false | true | false | false |\n+---------------------------+-------------------------+-------------------------+-------------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#regular_expressions","title":"Regular expressions","text":"Note
Regular expressions cannot work with native nGQL statements (GO
, FETCH
, LOOKUP
, etc.). Use it in openCypher only (MATCH
, WHERE
, etc.).
NebulaGraph supports filtering by using regular expressions. The regular expression syntax is inherited from std::regex
. You can match on regular expressions by using =~ 'regexp'
. For example:
nebula> RETURN \"384748.39\" =~ \"\\\\d+(\\\\.\\\\d{2})?\";\n+--------------------------------+\n| (\"384748.39\"=~\"\\d+(\\.\\d{2})?\") |\n+--------------------------------+\n| true |\n+--------------------------------+\n\nnebula> MATCH (v:player) WHERE v.player.name =~ 'Tony.*' RETURN v.player.name;\n+---------------+\n| v.player.name |\n+---------------+\n| \"Tony Parker\" |\n+---------------+\n
"},{"location":"3.ngql-guide/5.operators/8.list/","title":"List operators","text":"NebulaGraph supports the following list operators:
List operator Description + Concatenates lists. IN Checks if an element exists in a list. [] Accesses an element(s) in a list using the index operator."},{"location":"3.ngql-guide/5.operators/8.list/#examples","title":"Examples","text":"nebula> YIELD [1,2,3,4,5]+[6,7] AS myList;\n+-----------------------+\n| myList |\n+-----------------------+\n| [1, 2, 3, 4, 5, 6, 7] |\n+-----------------------+\n\nnebula> RETURN size([NULL, 1, 2]);\n+------------------+\n| size([NULL,1,2]) |\n+------------------+\n| 3 |\n+------------------+\n\nnebula> RETURN NULL IN [NULL, 1];\n+--------------------+\n| (NULL IN [NULL,1]) |\n+--------------------+\n| __NULL__ |\n+--------------------+\n\nnebula> WITH [2, 3, 4, 5] AS numberlist \\\n UNWIND numberlist AS number \\\n WITH number \\\n WHERE number IN [2, 3, 8] \\\n RETURN number;\n+--------+\n| number |\n+--------+\n| 2 |\n| 3 |\n+--------+\n\nnebula> WITH ['Anne', 'John', 'Bill', 'Diane', 'Eve'] AS names RETURN names[1] AS result;\n+--------+\n| result |\n+--------+\n| \"John\" |\n+--------+\n
"},{"location":"3.ngql-guide/5.operators/9.precedence/","title":"Operator precedence","text":"The following list shows the precedence of nGQL operators in descending order. Operators that are shown together on a line have the same precedence.
-
(negative number)!
, NOT
*
, /
, %
-
, +
==
, >=
, >
, <=
, <
, <>
, !=
AND
OR
, XOR
=
(assignment)For operators that occur at the same precedence level within an expression, evaluation proceeds left to right, with the exception that assignments evaluate right to left.
The precedence of operators determines the order of evaluation of terms in an expression. To modify this order and group terms explicitly, use parentheses.
"},{"location":"3.ngql-guide/5.operators/9.precedence/#examples","title":"Examples","text":"nebula> RETURN 2+3*5;\n+-----------+\n| (2+(3*5)) |\n+-----------+\n| 17 |\n+-----------+\n\nnebula> RETURN (2+3)*5;\n+-----------+\n| ((2+3)*5) |\n+-----------+\n| 25 |\n+-----------+\n
"},{"location":"3.ngql-guide/5.operators/9.precedence/#opencypher_compatibility","title":"OpenCypher compatibility","text":"In openCypher, comparisons can be chained arbitrarily, e.g., x < y <= z
is equivalent to x < y AND y <= z
in openCypher.
But in nGQL, x < y <= z
is equivalent to (x < y) <= z
. The result of (x < y)
is a boolean. Compare it with an integer z
, and you will get the final result NULL
.
This topic describes the built-in math functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#abs","title":"abs()","text":"abs() returns the absolute value of the argument.
Syntax: abs(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN abs(-10);\n+------------+\n| abs(-(10)) |\n+------------+\n| 10 |\n+------------+\nnebula> RETURN abs(5-6);\n+------------+\n| abs((5-6)) |\n+------------+\n| 1 |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#floor","title":"floor()","text":"floor() returns the largest integer value smaller than or equal to the argument.(Rounds down)
Syntax: floor(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN floor(9.9);\n+------------+\n| floor(9.9) |\n+------------+\n| 9.0 |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#ceil","title":"ceil()","text":"ceil() returns the smallest integer greater than or equal to the argument.(Rounds up)
Syntax: ceil(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN ceil(9.1);\n+-----------+\n| ceil(9.1) |\n+-----------+\n| 10.0 |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#round","title":"round()","text":"round() returns the rounded value of the specified number. Pay attention to the floating-point precision when using this function.
Syntax: round(<expression>, <digit>)
expression
: An expression of which the result type is double.digit
: Decimal digits. If digit
is less than 0, round at the left of the decimal point.Example:
nebula> RETURN round(314.15926, 2);\n+--------------------+\n| round(314.15926,2) |\n+--------------------+\n| 314.16 |\n+--------------------+\nnebula> RETURN round(314.15926, -1);\n+-----------------------+\n| round(314.15926,-(1)) |\n+-----------------------+\n| 310.0 |\n+-----------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#sqrt","title":"sqrt()","text":"sqrt() returns the square root of the argument.
Syntax: sqrt(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN sqrt(9);\n+---------+\n| sqrt(9) |\n+---------+\n| 3.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#cbrt","title":"cbrt()","text":"cbrt() returns the cubic root of the argument.
Syntax: cbrt(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN cbrt(8);\n+---------+\n| cbrt(8) |\n+---------+\n| 2.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#hypot","title":"hypot()","text":"hypot() returns the hypotenuse of a right-angled triangle.
Syntax: hypot(<expression_x>,<expression_y>)
expression_x
, expression_y
: An expression of which the result type is double. They represent the side lengths x and y of a right triangle.Example:
nebula> RETURN hypot(3,2*2);\n+----------------+\n| hypot(3,(2*2)) |\n+----------------+\n| 5.0 |\n+----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#pow","title":"pow()","text":"pow() returns the result of xy.
Syntax: pow(<expression_x>,<expression_y>,)
expression_x
: An expression of which the result type is double. It represents the base x
.expression_y
: An expression of which the result type is double. It represents the exponential y
.Example:
nebula> RETURN pow(3,3);\n+----------+\n| pow(3,3) |\n+----------+\n| 27 |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#exp","title":"exp()","text":"exp() returns the result of ex.
Syntax: exp(<expression>)
expression
: An expression of which the result type is double. It represents the exponential x
.Example:
nebula> RETURN exp(2);\n+------------------+\n| exp(2) |\n+------------------+\n| 7.38905609893065 |\n+------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#exp2","title":"exp2()","text":"exp2() returns the result of 2x.
Syntax: exp2(<expression>)
expression
: An expression of which the result type is double. It represents the exponential x
.Example:
nebula> RETURN exp2(3);\n+---------+\n| exp2(3) |\n+---------+\n| 8.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#log","title":"log()","text":"log() returns the base-e logarithm of the argument. (\\(log_{e}{N}\\))
Syntax: log(<expression>)
expression
: An expression of which the result type is double. It represents the antilogarithm N
.Example:
nebula> RETURN log(8);\n+--------------------+\n| log(8) |\n+--------------------+\n| 2.0794415416798357 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#log2","title":"log2()","text":"log2() returns the base-2 logarithm of the argument. (\\(log_{2}{N}\\))
Syntax: log2(<expression>)
expression
: An expression of which the result type is double. It represents the antilogarithm N
.Example:
nebula> RETURN log2(8);\n+---------+\n| log2(8) |\n+---------+\n| 3.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#log10","title":"log10()","text":"log10() returns the base-10 logarithm of the argument. (\\(log_{10}{N}\\))
Syntax: log10(<expression>)
expression
: An expression of which the result type is double. It represents the antilogarithm N
.Example:
nebula> RETURN log10(100);\n+------------+\n| log10(100) |\n+------------+\n| 2.0 |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#sin","title":"sin()","text":"sin() returns the sine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: sin(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN sin(3);\n+--------------------+\n| sin(3) |\n+--------------------+\n| 0.1411200080598672 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#asin","title":"asin()","text":"asin() returns the inverse sine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: asin(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN asin(0.5);\n+--------------------+\n| asin(0.5) |\n+--------------------+\n| 0.5235987755982989 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#cos","title":"cos()","text":"cos() returns the cosine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: cos(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN cos(0.5);\n+--------------------+\n| cos(0.5) |\n+--------------------+\n| 0.8775825618903728 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#acos","title":"acos()","text":"acos() returns the inverse cosine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: acos(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN acos(0.5);\n+--------------------+\n| acos(0.5) |\n+--------------------+\n| 1.0471975511965979 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#tan","title":"tan()","text":"tan() returns the tangent of the argument. Users can convert angles to radians using the function radians()
.
Syntax: tan(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN tan(0.5);\n+--------------------+\n| tan(0.5) |\n+--------------------+\n| 0.5463024898437905 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#atan","title":"atan()","text":"atan() returns the inverse tangent of the argument. Users can convert angles to radians using the function radians()
.
Syntax: atan(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN atan(0.5);\n+--------------------+\n| atan(0.5) |\n+--------------------+\n| 0.4636476090008061 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#rand","title":"rand()","text":"rand() returns a random floating point number in the range from 0 (inclusive) to 1 (exclusive); i.e.[0,1).
Syntax: rand()
Example:
nebula> RETURN rand();\n+--------------------+\n| rand() |\n+--------------------+\n| 0.6545837172298736 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#rand32","title":"rand32()","text":"rand32() returns a random 32-bit integer in [min, max)
.
Syntax: rand32(<expression_min>,<expression_max>)
expression_min
: An expression of which the result type is int. It represents the minimum min
.expression_max
: An expression of which the result type is int. It represents the maximum max
.max
and min
is 0
by default. If you set no argument, the system returns a random signed 32-bit integer.Example:
nebula> RETURN rand32(1,100);\n+---------------+\n| rand32(1,100) |\n+---------------+\n| 63 |\n+---------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#rand64","title":"rand64()","text":"rand64() returns a random 64-bit integer in [min, max)
.
Syntax: rand64(<expression_min>,<expression_max>)
expression_min
: An expression of which the result type is int. It represents the minimum min
.expression_max
: An expression of which the result type is int. It represents the maximum max
.max
and min
is 0
by default. If you set no argument, the system returns a random signed 64-bit integer.Example:
nebula> RETURN rand64(1,100);\n+---------------+\n| rand64(1,100) |\n+---------------+\n| 34 |\n+---------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#bit_and","title":"bit_and()","text":"bit_and() returns the result of bitwise AND.
Syntax: bit_and(<expression_1>,<expression_2>)
expression_1
, expression_2
: An expression of which the result type is int.Example:
nebula> RETURN bit_and(5,6);\n+--------------+\n| bit_and(5,6) |\n+--------------+\n| 4 |\n+--------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#bit_or","title":"bit_or()","text":"bit_or() returns the result of bitwise OR.
Syntax: bit_or(<expression_1>,<expression_2>)
expression_1
, expression_2
: An expression of which the result type is int.Example:
nebula> RETURN bit_or(5,6);\n+-------------+\n| bit_or(5,6) |\n+-------------+\n| 7 |\n+-------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#bit_xor","title":"bit_xor()","text":"bit_xor() returns the result of bitwise XOR.
Syntax: bit_xor(<expression_1>,<expression_2>)
expression_1
, expression_2
: An expression of which the result type is int.Example:
nebula> RETURN bit_xor(5,6);\n+--------------+\n| bit_xor(5,6) |\n+--------------+\n| 3 |\n+--------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#size","title":"size()","text":"size() returns the number of elements in a list or a map, or the length of a string.
Syntax: size({<expression>|<string>})
expression
: An expression for a list or map.string
: A specified string.Example:
nebula> RETURN size([1,2,3,4]);\n+-----------------+\n| size([1,2,3,4]) |\n+-----------------+\n| 4 |\n+-----------------+\n
nebula> RETURN size(\"basketballplayer\") as size;\n+------+\n| size |\n+------+\n| 16 |\n+------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#range","title":"range()","text":"range() returns a list of integers from [start,end]
in the specified steps.
Syntax: range(<expression_start>,<expression_end>[,<expression_step>])
expression_start
: An expression of which the result type is int. It represents the starting value start
.expression_end
: An expression of which the result type is int. It represents the end value end
.expression_step
: An expression of which the result type is int. It represents the step size step
, step
is 1 by default.Example:
nebula> RETURN range(1,3*3,2);\n+------------------+\n| range(1,(3*3),2) |\n+------------------+\n| [1, 3, 5, 7, 9] |\n+------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#sign","title":"sign()","text":"sign() returns the signum of the given number. If the number is 0
, the system returns 0
. If the number is negative, the system returns -1
. If the number is positive, the system returns 1
.
Syntax: sign(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN sign(10);\n+----------+\n| sign(10) |\n+----------+\n| 1 |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#e","title":"e()","text":"e() returns the base of the natural logarithm, e (2.718281828459045).
Syntax: e()
Example:
nebula> RETURN e();\n+-------------------+\n| e() |\n+-------------------+\n| 2.718281828459045 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#pi","title":"pi()","text":"pi() returns the mathematical constant pi (3.141592653589793).
Syntax: pi()
Example:
nebula> RETURN pi();\n+-------------------+\n| pi() |\n+-------------------+\n| 3.141592653589793 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#radians","title":"radians()","text":"radians() converts angles to radians.
Syntax: radians(<angle>)
Example:
nebula> RETURN radians(180);\n+-------------------+\n| radians(180) |\n+-------------------+\n| 3.141592653589793 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/14.geo/","title":"Geography functions","text":"Geography functions are used to generate or perform operations on the value of the geography data type.
For descriptions of the geography data types, see Geography.
"},{"location":"3.ngql-guide/6.functions-and-expressions/14.geo/#descriptions","title":"Descriptions","text":"Function Return Type Description ST_Point(longitude, latitude)GEOGRAPHY
Creates the geography that contains a point. ST_GeogFromText(wkt_string) GEOGRAPHY
Returns the geography corresponding to the input WKT string. ST_ASText(geography) STRING
Returns the WKT string of the input geography. ST_Centroid(geography) GEOGRAPHY
Returns the centroid of the input geography in the form of the single point geography. ST_ISValid(geography) BOOL
Returns whether the input geography is valid. ST_Intersects(geography_1, geography_2) BOOL
Returns whether geography_1 and geography_2 have intersections. ST_Covers(geography_1, geography_2) BOOL
Returns whether geography_1 completely contains geography_2. If there is no point outside geography_1 in geography_2, return True. ST_CoveredBy(geography_1, geography_2) BOOL
Returns whether geography_2 completely contains geography_1.If there is no point outside geography_2 in geography_1, return True. ST_DWithin(geography_1, geography_2, distance) BOOL
If the distance between one point (at least) in geography_1 and one point in geography_2 is less than or equal to the distance specified by the distance parameter (measured by meters), return True. ST_Distance(geography_1, geography_2) FLOAT
Returns the smallest possible distance (measured by meters) between two non-empty geographies. S2_CellIdFromPoint(point_geography) INT
Returns the S2 Cell ID that covers the point geography. S2_CoveringCellIds(geography) ARRAY<INT64>
Returns an array of S2 Cell IDs that cover the input geography."},{"location":"3.ngql-guide/6.functions-and-expressions/14.geo/#examples","title":"Examples","text":"nebula> RETURN ST_ASText(ST_Point(1,1));\n+--------------------------+\n| ST_ASText(ST_Point(1,1)) |\n+--------------------------+\n| \"POINT(1 1)\" |\n+--------------------------+\n\nnebula> RETURN ST_ASText(ST_GeogFromText(\"POINT(3 8)\"));\n+------------------------------------------+\n| ST_ASText(ST_GeogFromText(\"POINT(3 8)\")) |\n+------------------------------------------+\n| \"POINT(3 8)\" |\n+------------------------------------------+\n\nnebula> RETURN ST_ASTEXT(ST_Centroid(ST_GeogFromText(\"LineString(0 1,1 0)\")));\n+----------------------------------------------------------------+\n| ST_ASTEXT(ST_Centroid(ST_GeogFromText(\"LineString(0 1,1 0)\"))) |\n+----------------------------------------------------------------+\n| \"POINT(0.5000380800773782 0.5000190382261059)\" |\n+----------------------------------------------------------------+\n\nnebula> RETURN ST_ISValid(ST_GeogFromText(\"POINT(3 8)\"));\n+-------------------------------------------+\n| ST_ISValid(ST_GeogFromText(\"POINT(3 8)\")) |\n+-------------------------------------------+\n| true |\n+-------------------------------------------+\n\nnebula> RETURN ST_Intersects(ST_GeogFromText(\"LineString(0 1,1 0)\"),ST_GeogFromText(\"LineString(0 0,1 1)\"));\n+----------------------------------------------------------------------------------------------+\n| ST_Intersects(ST_GeogFromText(\"LineString(0 1,1 0)\"),ST_GeogFromText(\"LineString(0 0,1 1)\")) |\n+----------------------------------------------------------------------------------------------+\n| true |\n+----------------------------------------------------------------------------------------------+\n\nnebula> RETURN ST_Covers(ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\"),ST_Point(1,2));\n+--------------------------------------------------------------------------------+\n| ST_Covers(ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\"),ST_Point(1,2)) |\n+--------------------------------------------------------------------------------+\n| true |\n+--------------------------------------------------------------------------------+\n\nnebula> RETURN ST_CoveredBy(ST_Point(1,2),ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\"));\n+-----------------------------------------------------------------------------------+\n| ST_CoveredBy(ST_Point(1,2),ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\")) |\n+-----------------------------------------------------------------------------------+\n| true |\n+-----------------------------------------------------------------------------------+\n\nnebula> RETURN ST_dwithin(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\"),20000000000.0);\n+---------------------------------------------------------------------------------------+\n| ST_dwithin(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\"),20000000000) |\n+---------------------------------------------------------------------------------------+\n| true |\n+---------------------------------------------------------------------------------------+\n\nnebula> RETURN ST_Distance(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\"));\n+----------------------------------------------------------------------------+\n| ST_Distance(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\")) 
|\n+----------------------------------------------------------------------------+\n| 1.5685230187677438e+06 |\n+----------------------------------------------------------------------------+\n\nnebula> RETURN S2_CellIdFromPoint(ST_GeogFromText(\"Point(1 1)\"));\n+---------------------------------------------------+\n| S2_CellIdFromPoint(ST_GeogFromText(\"Point(1 1)\")) |\n+---------------------------------------------------+\n| 1153277837650709461 |\n+---------------------------------------------------+\n\nnebula> RETURN S2_CoveringCellIds(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\"));\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| S2_CoveringCellIds(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\")) |\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| [1152391494368201343, 1153466862374223872, 1153554823304445952, 1153836298281156608, 1153959443583467520, 1154240918560178176, 1160503736791990272, 1160591697722212352] |\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/","title":"Aggregating functions","text":"This topic describes the aggregating functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#avg","title":"avg()","text":"avg() returns the average value of the argument.
Syntax: avg(<expression>)
Example:
nebula> MATCH (v:player) RETURN avg(v.player.age);\n+--------------------+\n| avg(v.player.age) |\n+--------------------+\n| 33.294117647058826 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#count","title":"count()","text":"count() returns the number of records.
You can use count() and GROUP BY together to group and count the number of records, and then use YIELD to return the results.
You can also use count() together with RETURN. GROUP BY is not necessary in this case.
Syntax: count({<expression> | *})
Example:
nebula> WITH [NULL, 1, 1, 2, 2] As a UNWIND a AS b \\\n RETURN count(b), count(*), count(DISTINCT b);\n+----------+----------+-------------------+\n| count(b) | count(*) | count(distinct b) |\n+----------+----------+-------------------+\n| 4 | 5 | 2 |\n+----------+----------+-------------------+\n
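In the output above, count(b) is 4 because count() skips NULL, count(*) is 5 because it counts every row including the one holding NULL, and count(DISTINCT b) is 2 because only the distinct values 1 and 2 remain.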
# The statement in the following example searches for the people whom `player101` follows and people who follow `player101`, i.e. a bidirectional query.\n# Group and count the number of parameters.\nnebula> GO FROM \"player101\" OVER follow BIDIRECT \\\n YIELD properties($$).name AS Name \\\n | GROUP BY $-.Name YIELD $-.Name, count(*);\n+---------------------+----------+\n| $-.Name | count(*) |\n+---------------------+----------+\n| \"LaMarcus Aldridge\" | 2 |\n| \"Tim Duncan\" | 2 |\n| \"Marco Belinelli\" | 1 |\n| \"Manu Ginobili\" | 1 |\n| \"Boris Diaw\" | 1 |\n| \"Dejounte Murray\" | 1 |\n+---------------------+----------+\n\n# Count the number of parameters.\nnebula> MATCH (v1:player)-[:follow]-(v2:player) \\\n WHERE id(v1)== \"player101\" \\\n RETURN v2.player.name AS Name, count(*) as cnt ORDER BY cnt DESC;\n+---------------------+-----+\n| Name | cnt |\n+---------------------+-----+\n| \"LaMarcus Aldridge\" | 2 |\n| \"Tim Duncan\" | 2 |\n| \"Boris Diaw\" | 1 |\n| \"Manu Ginobili\" | 1 |\n| \"Dejounte Murray\" | 1 |\n| \"Marco Belinelli\" | 1 |\n+---------------------+-----+\n
The preceding example retrieves two columns:
$-.Name: the names of the people.
count(*): how many times the names show up.
Because there are no duplicate names in the basketballplayer dataset, the number 2 in the column count(*) shows that the person in that row and player101 have followed each other.
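Building on that observation, you can keep only the mutually-following pairs by filtering on the count. A minimal sketch (the cnt alias and the WITH-based filter are my own phrasing, assuming your NebulaGraph version supports WHERE after WITH):
nebula> MATCH (v1:player)-[:follow]-(v2:player) \\\n        WHERE id(v1) == \"player101\" \\\n        WITH v2.player.name AS Name, count(*) AS cnt \\\n        WHERE cnt == 2 \\\n        RETURN Name;\n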
# a: The statement in the following example retrieves the age distribution of the players in the dataset.\nnebula> LOOKUP ON player \\\n YIELD player.age As playerage \\\n | GROUP BY $-.playerage \\\n YIELD $-.playerage as age, count(*) AS number \\\n | ORDER BY $-.number DESC, $-.age DESC;\n+-----+--------+\n| age | number |\n+-----+--------+\n| 34 | 4 |\n| 33 | 4 |\n| 30 | 4 |\n| 29 | 4 |\n| 38 | 3 |\n+-----+--------+\n...\n# b: The statement in the following example retrieves the age distribution of the players in the dataset.\nnebula> MATCH (n:player) \\\n RETURN n.player.age as age, count(*) as number \\\n ORDER BY number DESC, age DESC;\n+-----+--------+\n| age | number |\n+-----+--------+\n| 34 | 4 |\n| 33 | 4 |\n| 30 | 4 |\n| 29 | 4 |\n| 38 | 3 |\n+-----+--------+\n...\n
# The statement in the following example counts the number of edges that Tim Duncan relates.\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) -[e]- (v2) \\\n RETURN count(e);\n+----------+\n| count(e) |\n+----------+\n| 13 |\n+----------+\n\n# The statement in the following example counts the number of edges that Tim Duncan relates and returns two columns (no DISTINCT and DISTINCT) in multi-hop queries.\nnebula> MATCH (n:player {name : \"Tim Duncan\"})-[]->(friend:player)-[]->(fof:player) \\\n RETURN count(fof), count(DISTINCT fof);\n+------------+---------------------+\n| count(fof) | count(distinct fof) |\n+------------+---------------------+\n| 4 | 3 |\n+------------+---------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#max","title":"max()","text":"max() returns the maximum value.
Syntax: max(<expression>)
Example:
nebula> MATCH (v:player) RETURN max(v.player.age);\n+-------------------+\n| max(v.player.age) |\n+-------------------+\n| 47 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#min","title":"min()","text":"min() returns the minimum value.
Syntax: min(<expression>)
Example:
nebula> MATCH (v:player) RETURN min(v.player.age);\n+-------------------+\n| min(v.player.age) |\n+-------------------+\n| 20 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#collect","title":"collect()","text":"collect() returns a list containing the values returned by an expression. Using this function aggregates data by merging multiple records or values into a single list.
Syntax: collect(<expression>)
Example:
nebula> UNWIND [1, 2, 1] AS a \\\n RETURN a;\n+---+\n| a |\n+---+\n| 1 |\n| 2 |\n| 1 |\n+---+\n\nnebula> UNWIND [1, 2, 1] AS a \\\n RETURN collect(a);\n+------------+\n| collect(a) |\n+------------+\n| [1, 2, 1] |\n+------------+\n\nnebula> UNWIND [1, 2, 1] AS a \\\n RETURN a, collect(a), size(collect(a));\n+---+------------+------------------+\n| a | collect(a) | size(collect(a)) |\n+---+------------+------------------+\n| 2 | [2] | 1 |\n| 1 | [1, 1] | 2 |\n+---+------------+------------------+\n\n# The following examples sort the results in descending order, limit output rows to 3, and collect the output into a list.\nnebula> UNWIND [\"c\", \"b\", \"a\", \"d\" ] AS p \\\n WITH p AS q \\\n ORDER BY q DESC LIMIT 3 \\\n RETURN collect(q);\n+-----------------+\n| collect(q) |\n+-----------------+\n| [\"d\", \"c\", \"b\"] |\n+-----------------+\n\nnebula> WITH [1, 1, 2, 2] AS coll \\\n UNWIND coll AS x \\\n WITH DISTINCT x \\\n RETURN collect(x) AS ss;\n+--------+\n| ss |\n+--------+\n| [1, 2] |\n+--------+\n\nnebula> MATCH (n:player) \\\n RETURN collect(n.player.age);\n+---------------------------------------------------------------+\n| collect(n.player.age) |\n+---------------------------------------------------------------+\n| [32, 32, 34, 29, 41, 40, 33, 25, 40, 37, ...\n...\n\n# The following example aggregates all the players' names by their ages.\nnebula> MATCH (n:player) \\\n RETURN n.player.age AS age, collect(n.player.name);\n+-----+--------------------------------------------------------------------------+\n| age | collect(n.player.name) |\n+-----+--------------------------------------------------------------------------+\n| 24 | [\"Giannis Antetokounmpo\"] |\n| 20 | [\"Luka Doncic\"] |\n| 25 | [\"Joel Embiid\", \"Kyle Anderson\"] |\n+-----+--------------------------------------------------------------------------+\n...\n\nnebula> GO FROM \"player100\" OVER serve \\\n YIELD properties($$).name AS name \\\n | GROUP BY $-.name \\\n YIELD collect($-.name) AS name;\n+-----------+\n| name |\n+-----------+\n| [\"Spurs\"] |\n+-----------+\n\nnebula> LOOKUP ON player \\\n YIELD player.age As playerage \\\n | GROUP BY $-.playerage \\\n YIELD collect($-.playerage) AS playerage;\n+------------------+\n| playerage |\n+------------------+\n| [22] |\n| [47] |\n| [43] |\n| [25, 25] |\n+------------------+\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#std","title":"std()","text":"std() returns the population standard deviation.
Syntax: std(<expression>)
Example:
nebula> MATCH (v:player) RETURN std(v.player.age);\n+-------------------+\n| std(v.player.age) |\n+-------------------+\n| 6.423895701687502 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#sum","title":"sum()","text":"sum() returns the sum value.
Syntax: sum(<expression>)
Example:
nebula> MATCH (v:player) RETURN sum(v.player.age);\n+-------------------+\n| sum(v.player.age) |\n+-------------------+\n| 1698 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#aggregating_example","title":"Aggregating example","text":"nebula> GO FROM \"player100\" OVER follow YIELD dst(edge) AS dst, properties($$).age AS age \\\n | GROUP BY $-.dst \\\n YIELD \\\n $-.dst AS dst, \\\n toInteger((sum($-.age)/count($-.age)))+avg(distinct $-.age+1)+1 AS statistics;\n+-------------+------------+\n| dst | statistics |\n+-------------+------------+\n| \"player125\" | 84.0 |\n| \"player101\" | 74.0 |\n+-------------+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/","title":"Type conversion functions","text":"This topic describes the type conversion functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#toboolean","title":"toBoolean()","text":"toBoolean() converts a string value to a boolean value.
Syntax: toBoolean(<value>)
Example:
nebula> UNWIND [true, false, 'true', 'false', NULL] AS b \\\n RETURN toBoolean(b) AS b;\n+----------+\n| b |\n+----------+\n| true |\n| false |\n| true |\n| false |\n| __NULL__ |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#tofloat","title":"toFloat()","text":"toFloat() converts an integer or string value to a floating point number.
Syntax: toFloat(<value>)
Example:
nebula> RETURN toFloat(1), toFloat('1.3'), toFloat('1e3'), toFloat('not a number');\n+------------+----------------+----------------+-------------------------+\n| toFloat(1) | toFloat(\"1.3\") | toFloat(\"1e3\") | toFloat(\"not a number\") |\n+------------+----------------+----------------+-------------------------+\n| 1.0 | 1.3 | 1000.0 | __NULL__ |\n+------------+----------------+----------------+-------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#tostring","title":"toString()","text":"toString() converts non-compound types of data, such as numbers, booleans, and so on, to strings.
Syntax: toString(<value>)
Example:
nebula> RETURN toString(9669) AS int2str, toString(null) AS null2str;\n+---------+----------+\n| int2str | null2str |\n+---------+----------+\n| \"9669\" | __NULL__ |\n+---------+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#tointeger","title":"toInteger()","text":"toInteger() converts a floating point or string value to an integer value.
Syntax: toInteger(<value>)
Example:
nebula> RETURN toInteger(1), toInteger('1'), toInteger('1e3'), toInteger('not a number');\n+--------------+----------------+------------------+---------------------------+\n| toInteger(1) | toInteger(\"1\") | toInteger(\"1e3\") | toInteger(\"not a number\") |\n+--------------+----------------+------------------+---------------------------+\n| 1 | 1 | 1000 | __NULL__ |\n+--------------+----------------+------------------+---------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#toset","title":"toSet()","text":"toSet() converts a list or set value to a set value.
Syntax: toSet(<value>)
Example:
nebula> RETURN toSet(list[1,2,3,1,2]) AS list2set;\n+-----------+\n| list2set |\n+-----------+\n| {3, 1, 2} |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#hash","title":"hash()","text":"hash() returns the hash value of the argument. The argument can be a number, a string, a list, a boolean, null, or an expression that evaluates to a value of the preceding data types.
The source code of the hash() function (MurmurHash2), its seed (0xc70f6907UL), and other parameters can be found in MurmurHash2.h.
For Java, the hash function operates as follows.
MurmurHash2.hash64(\"to_be_hashed\".getBytes(),\"to_be_hashed\".getBytes().length, 0xc70f6907)\n
Syntax: hash(<string>)
Example:
nebula> RETURN hash(\"abcde\");\n+--------------------+\n| hash(\"abcde\") |\n+--------------------+\n| 811036730794841393 |\n+--------------------+\nnebula> YIELD hash([1,2,3]);\n+----------------+\n| hash([1,2,3]) |\n+----------------+\n| 11093822460243 |\n+----------------+\nnebula> YIELD hash(NULL);\n+------------+\n| hash(NULL) |\n+------------+\n| -1 |\n+------------+\nnebula> YIELD hash(toLower(\"HELLO NEBULA\"));\n+-------------------------------+\n| hash(toLower(\"HELLO NEBULA\")) |\n+-------------------------------+\n| -8481157362655072082 |\n+-------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/","title":"Built-in string functions","text":"This topic describes the built-in string functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#precautions","title":"Precautions","text":"1
, while in C language it starts from 0
.strcasecmp() compares string a and b without case sensitivity.
Syntax: strcasecmp(<string_a>,<string_b>)
string_a, string_b: Strings to compare.
When string_a = string_b, the return value is 0. When string_a > string_b, the return value is greater than 0. When string_a < string_b, the return value is less than 0.
Example:
nebula> RETURN strcasecmp(\"a\",\"aa\");\n+----------------------+\n| strcasecmp(\"a\",\"aa\") |\n+----------------------+\n| -97 |\n+----------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#lower_and_tolower","title":"lower() and toLower()","text":"lower() and toLower() can both returns the argument in lowercase.
Syntax: lower(<string>)
, toLower(<string>)
string
: A specified string.Example:
nebula> RETURN lower(\"Basketball_Player\");\n+----------------------------+\n| lower(\"Basketball_Player\") |\n+----------------------------+\n| \"basketball_player\" |\n+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#upper_and_toupper","title":"upper() and toUpper()","text":"upper() and toUpper() can both returns the argument in uppercase.
Syntax: upper(<string>)
, toUpper(<string>)
string
: A specified string.Example:
nebula> RETURN upper(\"Basketball_Player\");\n+----------------------------+\n| upper(\"Basketball_Player\") |\n+----------------------------+\n| \"BASKETBALL_PLAYER\" |\n+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#length","title":"length()","text":"length() returns the length of the given string in bytes.
Syntax: length({<string>|<path>})
string
: A specified string.path
: A specified path represented by a variable.Example:
nebula> RETURN length(\"basketball\");\n+----------------------+\n| length(\"basketball\") |\n+----------------------+\n| 10 |\n+----------------------+\n
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) return length(p);\n+-----------+\n| length(p) |\n+-----------+\n| 1 |\n| 1 |\n| 1 |\n+-----------+\n
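When the argument is a path, length() returns the number of hops (edges) in the path, which is why every one-hop path above has a length of 1.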
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#trim","title":"trim()","text":"trim() removes the spaces at the leading and trailing of the string.
Syntax: trim(<string>)
string
: A specified string.Example:
nebula> RETURN trim(\" basketball player \");\n+-----------------------------+\n| trim(\" basketball player \") |\n+-----------------------------+\n| \"basketball player\" |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#ltrim","title":"ltrim()","text":"ltrim() removes the spaces at the leading of the string.
Syntax: ltrim(<string>)
string
: A specified string.Example:
nebula> RETURN ltrim(\" basketball player \");\n+------------------------------+\n| ltrim(\" basketball player \") |\n+------------------------------+\n| \"basketball player \" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#rtrim","title":"rtrim()","text":"rtrim() removes the spaces at the trailing of the string.
Syntax: rtrim(<string>)
string
: A specified string.Example:
nebula> RETURN rtrim(\" basketball player \");\n+------------------------------+\n| rtrim(\" basketball player \") |\n+------------------------------+\n| \" basketball player\" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#left","title":"left()","text":"left() returns a substring consisting of several characters from the leading of a string.
Syntax: left(<string>,<count>)
string
: A specified string.count
: The number of characters from the leading of the string. If the string is shorter than count
, the system returns the string itself.Example:
nebula> RETURN left(\"basketball_player\",6);\n+-----------------------------+\n| left(\"basketball_player\",6) |\n+-----------------------------+\n| \"basket\" |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#right","title":"right()","text":"right() returns a substring consisting of several characters from the trailing of a string.
Syntax: right(<string>,<count>)
string
: A specified string.count
: The number of characters from the trailing of the string. If the string is shorter than count
, the system returns the string itself.Example:
nebula> RETURN right(\"basketball_player\",6);\n+------------------------------+\n| right(\"basketball_player\",6) |\n+------------------------------+\n| \"player\" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#lpad","title":"lpad()","text":"lpad() pads a specified string from the left-side to the specified length and returns the result string.
Syntax: lpad(<string>,<count>,<letters>)
string: A specified string.
count: The length of the string after it has been left-padded. If count is less than the length of string, only the first count characters of string are returned.
letters: The string used to pad from the leading end.
Example:
nebula> RETURN lpad(\"abcd\",10,\"b\");\n+---------------------+\n| lpad(\"abcd\",10,\"b\") |\n+---------------------+\n| \"bbbbbbabcd\" |\n+---------------------+\nnebula> RETURN lpad(\"abcd\",3,\"b\");\n+--------------------+\n| lpad(\"abcd\",3,\"b\") |\n+--------------------+\n| \"abc\" |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#rpad","title":"rpad()","text":"rpad() pads a specified string from the right-side to the specified length and returns the result string.
Syntax: rpad(<string>,<count>,<letters>)
string: A specified string.
count: The length of the string after it has been right-padded. If count is less than the length of string, only the first count characters of string are returned.
letters: The string used to pad from the trailing end.
Example:
nebula> RETURN rpad(\"abcd\",10,\"b\");\n+---------------------+\n| rpad(\"abcd\",10,\"b\") |\n+---------------------+\n| \"abcdbbbbbb\" |\n+---------------------+\nnebula> RETURN rpad(\"abcd\",3,\"b\");\n+--------------------+\n| rpad(\"abcd\",3,\"b\") |\n+--------------------+\n| \"abc\" |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#substr_and_substring","title":"substr() and substring()","text":"substr() and substring() return a substring extracting count
characters starting from the specified position pos
of a specified string.
Syntax: substr(<string>,<pos>,<count>)
, substring(<string>,<pos>,<count>)
string
: A specified string.pos
: The position of starting extract (character index). Data type is int.count
: The number of characters extracted from the start position onwards.substr()
and substring()
","text":"pos
is 0, it extracts from the specified string leading (including the first character).pos
is greater than the maximum string index, an empty string is returned.pos
is a negative number, BAD_DATA
is returned.count
is omitted, the function returns the substring starting at the position given by pos
and extending to the end of the string.count
is 0, an empty string is returned.NULL
as any of the argument of substr()
will cause an issue.OpenCypher compatibility
In openCypher, if a
is null
, null
is returned.
Example:
nebula> RETURN substr(\"abcdefg\",2,4);\n+-----------------------+\n| substr(\"abcdefg\",2,4) |\n+-----------------------+\n| \"cdef\" |\n+-----------------------+\nnebula> RETURN substr(\"abcdefg\",0,4);\n+-----------------------+\n| substr(\"abcdefg\",0,4) |\n+-----------------------+\n| \"abcd\" |\n+-----------------------+\nnebula> RETURN substr(\"abcdefg\",2);\n+---------------------+\n| substr(\"abcdefg\",2) |\n+---------------------+\n| \"cdefg\" |\n+---------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#reverse","title":"reverse()","text":"reverse() returns a string in reverse order.
Syntax: reverse(<string>)
string
: A specified string.Example:
nebula> RETURN reverse(\"abcdefg\");\n+--------------------+\n| reverse(\"abcdefg\") |\n+--------------------+\n| \"gfedcba\" |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#replace","title":"replace()","text":"replace() replaces string a in a specified string with string b.
Syntax: replace(<string>,<substr_a>,<string_b>)
string: A specified string.
substr_a: String a, the substring to be replaced.
string_b: String b, the string that replaces substr_a.
Example:
nebula> RETURN replace(\"abcdefg\",\"cd\",\"AAAAA\");\n+---------------------------------+\n| replace(\"abcdefg\",\"cd\",\"AAAAA\") |\n+---------------------------------+\n| \"abAAAAAefg\" |\n+---------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#split","title":"split()","text":"split() splits a specified string at string b and returns a list of strings.
Syntax: split(<string>,<substr>)
string: A specified string.
substr: String b, the delimiter at which the string is split.
Example:
nebula> RETURN split(\"basketballplayer\",\"a\");\n+-------------------------------+\n| split(\"basketballplayer\",\"a\") |\n+-------------------------------+\n| [\"b\", \"sketb\", \"llpl\", \"yer\"] |\n+-------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#concat","title":"concat()","text":"concat() returns strings concatenated by all strings.
Syntax: concat(<string1>,<string2>,...)
If any one of the strings is NULL, NULL is returned.
Example:
//This example concatenates 1, 2, and 3.\nnebula> RETURN concat(\"1\",\"2\",\"3\") AS r;\n+-------+\n| r |\n+-------+\n| \"123\" |\n+-------+\n\n//In this example, one of the string is NULL.\nnebula> RETURN concat(\"1\",\"2\",NULL) AS r;\n+----------+\n| r |\n+----------+\n| __NULL__ |\n+----------+\n\nnebula> GO FROM \"player100\" over follow \\\n YIELD concat(src(edge), properties($^).age, properties($$).name, properties(edge).degree) AS A;\n+------------------------------+\n| A |\n+------------------------------+\n| \"player10042Tony Parker95\" |\n| \"player10042Manu Ginobili95\" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#concat_ws","title":"concat_ws()","text":"concat_ws() returns strings concatenated by all strings that are delimited with a separator.
Syntax: concat_ws(<separator>,<string1>,<string2>,... )
If the separator is NULL, the concat_ws() function returns NULL.
If the separator is not NULL and there is only one string, the string itself is returned.
If there is a NULL in the strings, NULL is ignored during the concatenation.
Example:
//This example concatenates a, b, and c with the separator +.\nnebula> RETURN concat_ws(\"+\",\"a\",\"b\",\"c\") AS r;\n+---------+\n| r |\n+---------+\n| \"a+b+c\" |\n+---------+\n\n//In this example, the separator is NULL.\nneubla> RETURN concat_ws(NULL,\"a\",\"b\",\"c\") AS r;\n+----------+\n| r |\n+----------+\n| __NULL__ |\n+----------+\n\n//In this example, the separator is + and there is a NULL in the strings.\nnebula> RETURN concat_ws(\"+\",\"a\",NULL,\"b\",\"c\") AS r;\n+---------+\n| r |\n+---------+\n| \"a+b+c\" |\n+---------+\n\n//In this example, the separator is + and there is only one string.\nnebula> RETURN concat_ws(\"+\",\"a\") AS r;\n+-----+\n| r |\n+-----+\n| \"a\" |\n+-----+\nnebula> GO FROM \"player100\" over follow \\\n YIELD concat_ws(\" \",src(edge), properties($^).age, properties($$).name, properties(edge).degree) AS A;\n+---------------------------------+\n| A |\n+---------------------------------+\n| \"player100 42 Tony Parker 95\" |\n| \"player100 42 Manu Ginobili 95\" |\n+---------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#extract","title":"extract()","text":"extract() uses regular expression matching to retrieve a single substring or all substrings from a string.
Syntax: extract(<string>,\"<regular_expression>\")
string: A specified string.
regular_expression: A regular expression.
Example:
nebula> MATCH (a:player)-[b:serve]-(c:team{name: \"Lakers\"}) \\\n WHERE a.player.age > 45 \\\n RETURN extract(a.player.name, \"\\\\w+\") AS result;\n+----------------------------+\n| result |\n+----------------------------+\n| [\"Shaquille\", \"O\", \"Neal\"] |\n+----------------------------+\n\nnebula> MATCH (a:player)-[b:serve]-(c:team{name: \"Lakers\"}) \\\n WHERE a.player.age > 45 \\\n RETURN extract(a.player.name, \"hello\") AS result;\n+--------+\n| result |\n+--------+\n| [] |\n+--------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#json_extract","title":"json_extract()","text":"json_extract() converts the specified JSON string to a map.
Syntax: json_extract(<string>)
string: A specified string, which must be a JSON string.
Caution
Example:
nebula> YIELD json_extract('{\"a\": 1, \"b\": {}, \"c\": {\"d\": true}}') AS result;\n+-----------------------------+\n| result |\n+-----------------------------+\n| {a: 1, b: {}, c: {d: true}} |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/3.date-and-time/","title":"Built-in date and time functions","text":"NebulaGraph supports the following built-in date and time functions:
Function Description int now() Returns the current timestamp of the system. timestamp timestamp() Returns the current timestamp of the system. date date() Returns the current UTC date based on the current system. time time() Returns the current UTC time based on the current system. datetime datetime() Returns the current UTC date and time based on the current system. map duration() Returns the period of time. It can be used to calculate the specified time.For more information, see Date and time types.
"},{"location":"3.ngql-guide/6.functions-and-expressions/3.date-and-time/#examples","title":"Examples","text":"nebula> RETURN now(), timestamp(), date(), time(), datetime();\n+------------+-------------+------------+-----------------+----------------------------+\n| now() | timestamp() | date() | time() | datetime() |\n+------------+-------------+------------+-----------------+----------------------------+\n| 1640057560 | 1640057560 | 2021-12-21 | 03:32:40.351000 | 2021-12-21T03:32:40.351000 |\n+------------+-------------+------------+-----------------+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/","title":"Schema-related functions","text":"This topic describes the schema-related functions supported by NebulaGraph. There are two types of schema-related functions, one for native nGQL statements and the other for openCypher-compatible statements.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#for_ngql_statements","title":"For nGQL statements","text":"The following functions are available in YIELD
and WHERE
clauses of nGQL statements.
Note
Since vertex, edge, vertices, edges, and path are keywords, you need to use AS <alias>
to set the alias, such as GO FROM \"player100\" OVER follow YIELD edge AS e;
.
id(vertex) returns the ID of a vertex.
Syntax: id(vertex)
Example:
nebula> LOOKUP ON player WHERE player.age > 45 YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player144\" |\n| \"player140\" |\n+-------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#propertiesvertex","title":"properties(vertex)","text":"properties(vertex) returns the properties of a vertex.
Syntax: properties(vertex)
Example:
nebula> LOOKUP ON player WHERE player.age > 45 \\\n YIELD properties(vertex);\n+-------------------------------------+\n| properties(VERTEX) |\n+-------------------------------------+\n| {age: 47, name: \"Shaquille O'Neal\"} |\n| {age: 46, name: \"Grant Hill\"} |\n+-------------------------------------+\n
You can also use the property reference symbols ($^
and $$
) instead of the vertex
field in the properties()
function to get all properties of a vertex.
$^
represents the data of the starting vertex at the beginning of exploration. For example, in GO FROM \"player100\" OVER follow reversely YIELD properties($^)
, $^
refers to the vertex player100
.$$
represents the data of the end vertex at the end of exploration.properties($^)
and properties($$)
are generally used in GO
statements. For more information, see Property reference.
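For instance, a minimal GO sketch that returns properties of both endpoints via the two symbols (the column aliases are my own):
nebula> GO FROM \"player100\" OVER follow \\\n        YIELD properties($^).name AS src_name, properties($$).name AS dst_name;\n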
Caution
You can use properties().<property_name>
to get a specific property of a vertex. However, it is not recommended to use this method to obtain specific properties because the properties()
function returns all properties, which can decrease query performance.
properties(edge) returns the properties of an edge.
Syntax: properties(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties(edge);\n+------------------+\n| properties(EDGE) |\n+------------------+\n| {degree: 95} |\n| {degree: 95} |\n+------------------+\n
Caution
You can use properties(edge).<property_name>
to get a specific property of an edge. However, it is not recommended to use this method to obtain specific properties because the properties(edge)
function returns all properties, which can decrease query performance.
type(edge) returns the edge type of an edge.
Syntax: type(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge), type(edge), rank(edge);\n+-------------+-------------+------------+------------+\n| src(EDGE) | dst(EDGE) | type(EDGE) | rank(EDGE) |\n+-------------+-------------+------------+------------+\n| \"player100\" | \"player101\" | \"follow\" | 0 |\n| \"player100\" | \"player125\" | \"follow\" | 0 |\n+-------------+-------------+------------+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#srcedge","title":"src(edge)","text":"src(edge) returns the source vertex ID of an edge.
Syntax: src(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge);\n+-------------+-------------+\n| src(EDGE) | dst(EDGE) |\n+-------------+-------------+\n| \"player100\" | \"player101\" |\n| \"player100\" | \"player125\" |\n+-------------+-------------+\n
Note
The semantics of the query for the starting vertex with src(edge) and properties($^
) are different. src(edge) indicates the starting vertex ID of the edge in the graph database, while properties($^
) indicates the data of the starting vertex where you start to expand the graph, such as the data of the starting vertex player100
in the above GO statement.
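A side-by-side sketch of the two (my own illustrative query): src(edge) yields the source VID of each traversed edge, while properties($^).name yields a property of the vertex the traversal started from:
nebula> GO FROM \"player100\" OVER follow \\\n        YIELD src(edge) AS edge_src, properties($^).name AS start_name;\n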
dst(edge) returns the destination vertex ID of an edge.
Syntax: dst(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge);\n+-------------+-------------+\n| src(EDGE) | dst(EDGE) |\n+-------------+-------------+\n| \"player100\" | \"player101\" |\n| \"player100\" | \"player125\" |\n+-------------+-------------+\n
Note
dst(edge) indicates the destination vertex ID of the edge in the graph database.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#rankedge","title":"rank(edge)","text":"rank(edge) returns the rank value of an edge.
Syntax: rank(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge), rank(edge);\n+-------------+-------------+------------+\n| src(EDGE) | dst(EDGE) | rank(EDGE) |\n+-------------+-------------+------------+\n| \"player100\" | \"player101\" | 0 |\n| \"player100\" | \"player125\" | 0 |\n+-------------+-------------+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#vertex","title":"vertex","text":"vertex returns the information of vertices, including VIDs, tags, properties, and values. You need to use AS <alias>
to set the alias.
Syntax: vertex
Example:
nebula> LOOKUP ON player WHERE player.age > 45 YIELD vertex AS v;\n+----------------------------------------------------------+\n| v |\n+----------------------------------------------------------+\n| (\"player144\" :player{age: 47, name: \"Shaquille O'Neal\"}) |\n| (\"player140\" :player{age: 46, name: \"Grant Hill\"}) |\n+----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#edge","title":"edge","text":"edge returns the information of edges, including edge types, source vertices, destination vertices, ranks, properties, and values. You need to use AS <alias>
to set the alias.
Syntax: edge
Example:
nebula> GO FROM \"player100\" OVER follow YIELD edge AS e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#vertices","title":"vertices","text":"vertices returns the information of vertices in a subgraph. For more information, see GET SUBGRAPH.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#edges","title":"edges","text":"edges returns the information of edges in a subgraph. For more information, see GET SUBGRAPH.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#path","title":"path","text":"path returns the information of a path. For more information, see FIND PATH.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#for_statements_compatible_with_opencypher","title":"For statements compatible with openCypher","text":"The following functions are available in RETURN
and WHERE
clauses of openCypher-compatible statements.
id() returns the ID of a vertex.
Syntax: id(<vertex>)
Example:
nebula> MATCH (v:player) RETURN id(v); \n+-------------+\n| id(v) |\n+-------------+\n| \"player129\" |\n| \"player115\" |\n| \"player106\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#tags_and_labels","title":"tags() and labels()","text":"tags() and labels() return the Tag of a vertex.
Syntax: tags(<vertex>)
, labels(<vertex>)
Example:
nebula> MATCH (v) WHERE id(v) == \"player100\" \\\n RETURN tags(v);\n+------------+\n| tags(v) |\n+------------+\n| [\"player\"] |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#properties","title":"properties()","text":"properties() returns the properties of a vertex or an edge.
Syntax: properties(<vertex_or_edge>)
Example:
nebula> MATCH (v:player)-[e:follow]-() RETURN properties(v),properties(e);\n+---------------------------------------+---------------+\n| properties(v) | properties(e) |\n+---------------------------------------+---------------+\n| {age: 31, name: \"Stephen Curry\"} | {degree: 90} |\n| {age: 47, name: \"Shaquille O'Neal\"} | {degree: 100} |\n| {age: 34, name: \"LeBron James\"} | {degree: 13} |\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#type","title":"type()","text":"type() returns the edge type of an edge.
Syntax: type(<edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN type(e);\n+----------+\n| type(e) |\n+----------+\n| \"serve\" |\n| \"follow\" |\n| \"follow\" |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#typeid","title":"typeid()","text":"typeid() returns the internal ID value of the Edge type of the edge, which can be used to determine the direction by positive or negative.
Syntax: typeid(<edge>)
Example:
nebula> MATCH (v:player)-[e:follow]-(v2) RETURN e,typeid(e), \\\n CASE WHEN typeid(e) > 0 \\\n THEN \"Forward\" ELSE \"Reverse\" END AS direction \\\n LIMIT 5;\n+----------------------------------------------------+-----------+-----------+\n| e | typeid(e) | direction |\n+----------------------------------------------------+-----------+-----------+\n| [:follow \"player127\"->\"player114\" @0 {degree: 90}] | 5 | \"Forward\" |\n| [:follow \"player127\"->\"player148\" @0 {degree: 70}] | 5 | \"Forward\" |\n| [:follow \"player148\"->\"player127\" @0 {degree: 80}] | -5 | \"Reverse\" |\n| [:follow \"player147\"->\"player136\" @0 {degree: 90}] | 5 | \"Forward\" |\n| [:follow \"player136\"->\"player147\" @0 {degree: 90}] | -5 | \"Reverse\" |\n+----------------------------------------------------+-----------+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#src","title":"src()","text":"src() returns the source vertex ID of an edge.
Syntax: src(<edge>)
Example:
nebula> MATCH ()-[e]->(v:player{name:\"Tim Duncan\"}) \\\n RETURN src(e);\n+-------------+\n| src(e) |\n+-------------+\n| \"player125\" |\n| \"player113\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#dst","title":"dst()","text":"dst() returns the destination vertex ID of an edge.
Syntax: dst(<edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN dst(e);\n+-------------+\n| dst(e) |\n+-------------+\n| \"team204\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#startnode","title":"startNode()","text":"startNode() visits a path and returns its information of source vertex ID, including VIDs, tags, properties, and values.
Syntax: startNode(<path>)
Example:
nebula> MATCH p = (a :player {name : \"Tim Duncan\"})-[r:serve]-(t) \\\n RETURN startNode(p);\n+----------------------------------------------------+\n| startNode(p) |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#endnode","title":"endNode()","text":"endNode() visits a path and returns its information of destination vertex ID, including VIDs, tags, properties, and values.
Syntax: endNode(<path>)
Example:
nebula> MATCH p = (a :player {name : \"Tim Duncan\"})-[r:serve]-(t) \\\n RETURN endNode(p);\n+----------------------------------+\n| endNode(p) |\n+----------------------------------+\n| (\"team204\" :team{name: \"Spurs\"}) |\n+----------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#rank","title":"rank()","text":"rank() returns the rank value of an edge.
Syntax: rank(<edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN rank(e);\n+---------+\n| rank(e) |\n+---------+\n| 0 |\n| 0 |\n| 0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/","title":"Conditional expressions","text":"This topic describes the conditional functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/#case","title":"CASE","text":"The CASE
expression uses conditions to filter the parameters. nGQL provides two forms of CASE
expressions just like openCypher: the simple form and the generic form.
The CASE
expression will traverse all the conditions. When the first condition is met, the CASE
expression stops reading the conditions and returns the result. If no conditions are met, it returns the result in the ELSE
clause. If there is no ELSE
clause and no conditions are met, it returns NULL
.
CASE <comparer>\nWHEN <value> THEN <result>\n[WHEN ...]\n[ELSE <default>]\nEND\n
Caution
Always remember to end the CASE
expression with an END
.
comparer
A value or a valid expression that outputs a value. This value is used to compare with the value
. value
It will be compared with the comparer
. If the value
matches the comparer
, then this condition is met. result
The result
is returned by the CASE
expression if the value
matches the comparer
. default
The default
is returned by the CASE
expression if no conditions are met. nebula> RETURN \\\n CASE 2+3 \\\n WHEN 4 THEN 0 \\\n WHEN 5 THEN 1 \\\n ELSE -1 \\\n END \\\n AS result;\n+--------+\n| result |\n+--------+\n| 1 |\n+--------+\n
nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties($$).name AS Name, \\\n CASE properties($$).age > 35 \\\n WHEN true THEN \"Yes\" \\\n WHEN false THEN \"No\" \\\n ELSE \"Nah\" \\\n END \\\n AS Age_above_35;\n+-----------------+--------------+\n| Name | Age_above_35 |\n+-----------------+--------------+\n| \"Tony Parker\" | \"Yes\" |\n| \"Manu Ginobili\" | \"Yes\" |\n+-----------------+--------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/#the_generic_form_of_case_expressions","title":"The generic form of CASE expressions","text":"CASE\nWHEN <condition> THEN <result>\n[WHEN ...]\n[ELSE <default>]\nEND\n
Parameter Description condition
If the condition
is evaluated as true, the result
is returned by the CASE
expression. result
The result
is returned by the CASE
expression if the condition
is evaluated as true. default
The default
is returned by the CASE
expression if no conditions are met. nebula> YIELD \\\n CASE WHEN 4 > 5 THEN 0 \\\n WHEN 3+4==7 THEN 1 \\\n ELSE 2 \\\n END \\\n AS result;\n+--------+\n| result |\n+--------+\n| 1 |\n+--------+\n
nebula> MATCH (v:player) WHERE v.player.age > 30 \\\n RETURN v.player.name AS Name, \\\n CASE \\\n WHEN v.player.name STARTS WITH \"T\" THEN \"Yes\" \\\n ELSE \"No\" \\\n END \\\n AS Starts_with_T;\n+---------------------+---------------+\n| Name | Starts_with_T |\n+---------------------+---------------+\n| \"Tim Duncan\" | \"Yes\" |\n| \"LaMarcus Aldridge\" | \"No\" |\n| \"Tony Parker\" | \"Yes\" |\n+---------------------+---------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/#differences_between_the_simple_form_and_the_generic_form","title":"Differences between the simple form and the generic form","text":"To avoid the misuse of the simple form and the generic form, it is important to understand their differences. The following example can help explain them.
nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties($$).name AS Name, properties($$).age AS Age, \\\n CASE properties($$).age \\\n WHEN properties($$).age > 35 THEN \"Yes\" \\\n ELSE \"No\" \\\n END \\\n AS Age_above_35;\n+-----------------+-----+--------------+\n| Name | Age | Age_above_35 |\n+-----------------+-----+--------------+\n| \"Tony Parker\" | 36 | \"No\" |\n| \"Manu Ginobili\" | 41 | \"No\" |\n+-----------------+-----+--------------+\n
The preceding GO
query is intended to output Yes
when the player's age is above 35. However, in this example, when the player's age is 36, the actual output is not as expected: It is No
instead of Yes
.
This is because the query uses the CASE
expression in the simple form, and a comparison between the values of $$.player.age
and $$.player.age > 35
is made. When the player age is 36:
$$.player.age
is 36
. It is an integer.$$.player.age > 35
is evaluated to be true
. It is a boolean.The values of $$.player.age
and $$.player.age > 35
do not match. Therefore, the condition is not met and No
is returned.
coalesce() returns the first not null value in all expressions.
Syntax: coalesce(<expression_1>[,<expression_2>...])
Example:
nebula> RETURN coalesce(null,[1,2,3]) as result;\n+-----------+\n| result |\n+-----------+\n| [1, 2, 3] |\n+-----------+\nnebula> RETURN coalesce(null) as result;\n+----------+\n| result |\n+----------+\n| __NULL__ |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/","title":"List functions","text":"This topic describes the list functions supported by NebulaGraph. Some of the functions have different syntax in native nGQL statements and openCypher-compatible statements.
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#precautions","title":"Precautions","text":"Like SQL, the position index in nGQL starts from 1
, while in the C language it starts from 0
.
range() returns the list containing all the fixed-length steps in [start,end]
.
Syntax: range(start, end [, step])
step
: Optional parameters. step
is 1 by default.Example:
nebula> RETURN range(1,9,2);\n+-----------------+\n| range(1,9,2) |\n+-----------------+\n| [1, 3, 5, 7, 9] |\n+-----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#reverse","title":"reverse()","text":"reverse() returns the list reversing the order of all elements in the original list.
Syntax: reverse(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN reverse(ids);\n+-----------------------------------+\n| reverse(ids) |\n+-----------------------------------+\n| [487, 521, \"abc\", 4923, __NULL__] |\n+-----------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#tail","title":"tail()","text":"tail() returns all the elements of the original list, excluding the first one.
Syntax: tail(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN tail(ids);\n+-------------------------+\n| tail(ids) |\n+-------------------------+\n| [4923, \"abc\", 521, 487] |\n+-------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#head","title":"head()","text":"head() returns the first element of a list.
Syntax: head(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN head(ids);\n+-----------+\n| head(ids) |\n+-----------+\n| __NULL__ |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#last","title":"last()","text":"last() returns the last element of a list.
Syntax: last(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN last(ids);\n+-----------+\n| last(ids) |\n+-----------+\n| 487 |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#reduce","title":"reduce()","text":"reduce() applies an expression to each element in a list one by one, chains the result to the next iteration by taking it as the initial value, and returns the final result. This function iterates each element e
in the given list, runs the expression on e
, accumulates the result with the initial value, and store the new result in the accumulator as the initial value of the next iteration. It works like the fold or reduce method in functional languages such as Lisp and Scala.
openCypher compatibility
In openCypher, the reduce()
function is not defined. nGQL will implement the reduce()
function in the Cypher way.
Syntax: reduce(<accumulator> = <initial>, <variable> IN <list> | <expression>)
accumulator
: A variable that will hold the accumulated results as the list is iterated.initial
: An expression that runs once to give an initial value to the accumulator
.variable
: A variable in the list that will be applied to the expression successively.list
: A list or a list of expressions.expression
: This expression will be run on each element in the list once and store the result value in the accumulator
.Example:
nebula> RETURN reduce(totalNum = -4 * 5, n IN [1, 2] | totalNum + n * 2) AS r;\n+-----+\n| r |\n+-----+\n| -14 |\n+-----+\n\nnebula> MATCH p = (n:player{name:\"LeBron James\"})<-[:follow]-(m) \\\n RETURN nodes(p)[0].player.age AS src1, nodes(p)[1].player.age AS dst2, \\\n reduce(totalAge = 100, n IN nodes(p) | totalAge + n.player.age) AS sum;\n+------+------+-----+\n| src1 | dst2 | sum |\n+------+------+-----+\n| 34 | 31 | 165 |\n| 34 | 29 | 163 |\n| 34 | 33 | 167 |\n| 34 | 26 | 160 |\n| 34 | 34 | 168 |\n| 34 | 37 | 171 |\n+------+------+-----+\n\nnebula> LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD id(vertex) AS VertexID \\\n | GO FROM $-.VertexID over follow \\\n WHERE properties(edge).degree != reduce(totalNum = 5, n IN range(1, 3) | properties($$).age + totalNum + n) \\\n YIELD properties($$).name AS id, properties($$).age AS age, properties(edge).degree AS degree;\n+---------------------+-----+--------+\n| id | age | degree |\n+---------------------+-----+--------+\n| \"Tim Duncan\" | 42 | 95 |\n| \"LaMarcus Aldridge\" | 33 | 90 |\n| \"Manu Ginobili\" | 41 | 95 |\n+---------------------+-----+--------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#for_ngql_statements","title":"For nGQL statements","text":""},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#keys","title":"keys()","text":"keys() returns a list containing the string representations for all the property names of vertices or edges.
Syntax: keys({vertex | edge})
Example:
nebula> LOOKUP ON player \\\n WHERE player.age > 45 \\\n YIELD keys(vertex);\n+-----------------+\n| keys(VERTEX) |\n+-----------------+\n| [\"age\", \"name\"] |\n| [\"age\", \"name\"] |\n+-----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#labels","title":"labels()","text":"labels() returns the list containing all the tags of a vertex.
Syntax: labels(verte)
Example:
nebula> FETCH PROP ON * \"player101\", \"player102\", \"team204\" \\\n YIELD labels(vertex);\n+----------------+\n| labels(VERTEX) |\n+----------------+\n| [\"player\"] |\n| [\"player\"] |\n| [\"team\"] |\n+----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#for_statements_compatible_with_opencypher","title":"For statements compatible with openCypher","text":""},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#keys_1","title":"keys()","text":"keys() returns a list containing the string representations for all the property names of vertices, edges, or maps.
Syntax: keys(<vertex_or_edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN keys(e);\n+----------------------------+\n| keys(e) |\n+----------------------------+\n| [\"end_year\", \"start_year\"] |\n| [\"degree\"] |\n| [\"degree\"] |\n+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#labels_1","title":"labels()","text":"labels() returns the list containing all the tags of a vertex.
Syntax: labels(<vertex>)
Example:
nebula> MATCH (v)-[e:serve]->() \\\n WHERE id(v)==\"player100\" \\\n RETURN labels(v);\n+------------+\n| labels(v) |\n+------------+\n| [\"player\"] |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#nodes","title":"nodes()","text":"nodes() returns the list containing all the vertices in a path.
Syntax: nodes(<path>)
Example:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) \\\n RETURN nodes(p);\n+-------------------------------------------------------------------------------------------------------------+\n| nodes(p) |\n+-------------------------------------------------------------------------------------------------------------+\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"team204\" :team{name: \"Spurs\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player101\" :player{age: 36, name: \"Tony Parker\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player125\" :player{age: 41, name: \"Manu Ginobili\"})] |\n+-------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#relationships","title":"relationships()","text":"relationships() returns the list containing all the relationships in a path.
Syntax: relationships(<path>)
Example:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) \\\n RETURN relationships(p);\n+-------------------------------------------------------------------------+\n| relationships(p) |\n+-------------------------------------------------------------------------+\n| [[:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}]] |\n| [[:follow \"player100\"->\"player101\" @0 {degree: 95}]] |\n| [[:follow \"player100\"->\"player125\" @0 {degree: 95}]] |\n+-------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/8.predicate/","title":"Predicate functions","text":"Predicate functions return true
or false
. They are most commonly used in WHERE
clauses.
NebulaGraph supports the following predicate functions:
Functions Description exists() Returnstrue
if the specified property exists in the vertex, edge or map. Otherwise, returns false
. any() Returns true
if the specified predicate holds for at least one element in the given list. Otherwise, returns false
. all() Returns true
if the specified predicate holds for all elements in the given list. Otherwise, returns false
. none() Returns true
if the specified predicate holds for no element in the given list. Otherwise, returns false
. single() Returns true
if the specified predicate holds for exactly one of the elements in the given list. Otherwise, returns false
. Note
NULL is returned if the list is NULL or all of its elements are NULL.
Compatibility
In openCypher, only function exists()
is defined and specified. The other functions are implement-dependent.
<predicate>(<variable> IN <list> WHERE <condition>)\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/8.predicate/#examples","title":"Examples","text":"nebula> RETURN any(n IN [1, 2, 3, 4, 5, NULL] \\\n WHERE n > 2) AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> RETURN single(n IN range(1, 5) \\\n WHERE n == 3) AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> RETURN none(n IN range(1, 3) \\\n WHERE n == 0) AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> WITH [1, 2, 3, 4, 5, NULL] AS a \\\n RETURN any(n IN a WHERE n > 2);\n+-------------------------+\n| any(n IN a WHERE (n>2)) |\n+-------------------------+\n| true |\n+-------------------------+\n\nnebula> MATCH p = (n:player{name:\"LeBron James\"})<-[:follow]-(m) \\\n RETURN nodes(p)[0].player.name AS n1, nodes(p)[1].player.name AS n2, \\\n all(n IN nodes(p) WHERE n.player.name NOT STARTS WITH \"D\") AS b;\n+----------------+-------------------+-------+\n| n1 | n2 | b |\n+----------------+-------------------+-------+\n| \"LeBron James\" | \"Danny Green\" | false |\n| \"LeBron James\" | \"Dejounte Murray\" | false |\n| \"LeBron James\" | \"Chris Paul\" | true |\n| \"LeBron James\" | \"Kyrie Irving\" | true |\n| \"LeBron James\" | \"Carmelo Anthony\" | true |\n| \"LeBron James\" | \"Dwyane Wade\" | false |\n+----------------+-------------------+-------+\n\nnebula> MATCH p = (n:player{name:\"LeBron James\"})-[:follow]->(m) \\\n RETURN single(n IN nodes(p) WHERE n.player.age > 40) AS b;\n+------+\n| b |\n+------+\n| true |\n+------+\n\nnebula> MATCH (n:player) \\\n RETURN exists(n.player.id), n IS NOT NULL;\n+---------------------+---------------+\n| exists(n.player.id) | n IS NOT NULL |\n+---------------------+---------------+\n| false | true |\n...\n\nnebula> MATCH (n:player) \\\n WHERE exists(n['name']) RETURN n;\n+-------------------------------------------------------------------------------------------------------------+\n| n |\n+-------------------------------------------------------------------------------------------------------------+\n| (\"Grant Hill\" :player{age: 46, name: \"Grant Hill\"}) |\n| (\"Marc Gasol\" :player{age: 34, name: \"Marc Gasol\"}) |\n+-------------------------------------------------------------------------------------------------------------+\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/","title":"Overview of NebulaGraph general query statements","text":"This topic provides an overview of the general categories of query statements in NebulaGraph and outlines their use cases.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#background","title":"Background","text":"NebulaGraph stores data in the form of vertices and edges. Each vertex can have zero or more tags and each edge has exactly one edge type. Tags define the type of a vertex and describe its properties, while edge types define the type of an edge and describe its properties. When querying, you can limit the scope of the query by specifying the tag of a vertex or the type of an edge. For more information, see Patterns.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#categories","title":"Categories","text":"The primary query statements in NebulaGraph fall into the following categories:
FETCH PROP ON
and LOOKUP ON
statements are primarily for basic data queries, GO
and MATCH
for more intricate queries and graph traversals, FIND PATH
and GET SUBGRAPH
for path and subgraph queries, and SHOW
for retrieving database metadata.
Usage: Retrieve properties of a specified vertex or edge.
Use case: Knowing the specific vertex or edge ID and wanting to retrieve its properties.
Note: Must use the
YIELD
clause to specify the returned properties.Example:
FETCH PROP ON player \"player100\" YIELD properties(vertex);\n --+--- ----+----- -------+----------\n | | |\n | | |\n | | +--------- Returns all properties under the player tag of the vertex.\n | |\n | +----------------- Retrieves from the vertex \"player100\".\n |\n +--------------------------- Retrieves properties under the player tag.\n
For more information, see FETCH PROP ON.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#lookup_on","title":"LOOKUP ON","text":"Usage: Index-based querying of vertex or edge IDs.
Use case: Finding vertex or edge IDs based on property values.
Note: - Must pre-define indexes for the tag, edge type, or property. - Must specify the tag of the vertex or the edge type of the edge. - Must use the YIELD
clause to specify the returned IDs.
Example:
LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD id(vertex);\n --+--- ------------------+--------------- ---+------\n | | |\n | | |\n | | +---- Returns the VID of the retrieved vertex.\n | |\n | +------------ Filtering is based on the value of the property name.\n |\n +----------------------------------- Queries based on the player tag.\n
For more information, see LOOKUP ON.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#go","title":"GO","text":"Usage: Traverse the graph based on a given vertex and return information about the starting vertex, edges, or target vertices as needed. Use case: Complex graph traversals, such as finding friends of a vertex, friends' friends, etc.
Note: - Use property reference symbols ($^
and $$
) to return properties of the starting or target vertices, e.g., YIELD $^.player.name
. - Use the functions properties($^)
and properties($$)
to return all properties of the starting or target vertices. Specify property names in the function to return specific properties, e.g., YIELD properties($^).name
. - Use the functions src(edge)
and dst(edge)
to return the starting or destination vertex ID of an edge, e.g., YIELD src(edge)
.
Example:
GO 3 STEPS FROM \"player102\" OVER follow YIELD dst(edge);\n-----+--- --+------- -+---- ---+-----\n | | | |\n | | | |\n | | | +--------- Returns the destination vertex of the last hop.\n | | |\n | | +------ Traverses out via the edge follow.\n | |\n | +--------------------- Starts from \"player102\".\n |\n +---------------------------------- Traverses 3 steps.\n
For more information, see GO.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#match","title":"MATCH","text":"Usage: Execute complex graph pattern matching queries.
Use case: Complex graph pattern matching, such as finding combinations of vertices and edges that satisfy a specific pattern.
Note:
MATCH
statements are compatible with the OpenCypher syntax but with some differences:
==
for equality instead of =
, e.g., WHERE player.name == \"Tony Parker\"
.YIELD player.name
.WHERE id(v) == \"player100\"
syntax.RETURN
clause to specify what information to return.Example:
MATCH (v:player{name:\"Tim Duncan\"})-->(v2:player) \\\n RETURN v2.player.name AS Name;\n
For more information, see MATCH.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#find_path","title":"FIND PATH","text":"Usage: Query paths between given starting and target vertices or query properties of vertices and edges along paths.
Use case: Querying paths between two vertices.
Note: Must use the YIELD
clause to specify returned information.
Example:
FIND SHORTEST PATH FROM \"player102\" TO \"team204\" OVER * YIELD path AS p;\n-------+----- -------+---------------- ---+-- ----+----\n | | | |\n | | | |\n | | | +---------- Returns the path as 'p'.\n | | |\n | | +----------- Travels outwards via all types of edges.\n | | \n | |\n | +------------------ From the given starting and target VIDs. \n |\n +--------------------------- Retrieves the shortest path.\n
For more information, see FIND PATH.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#get_subgraph","title":"GET SUBGRAPH","text":"Usage: Extract a portion of the graph that satisfies specific conditions or query properties of vertices and edges in the subgraph.
Use case: Analyzing structures of the graph or specific regions, such as extracting the social network subgraph of a person or the transportation network subgraph of an area.
Note: Must use the YIELD
clause to specify returned information.
Example:
GET SUBGRAPH 5 STEPS FROM \"player101\" YIELD VERTICES AS nodes, EDGES AS relationships;\n -----+- -----+-------- ------------------------+----------------\n | | |\n | | |\n | +------- Starts from \"player101\". +------------ Returns all vertices and edges.\n |\n +----------------- Gets exploration of 5 steps \n
For more information, see GET SUBGRAPH.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#show","title":"SHOW","text":"SHOW
statements are mainly used to obtain metadata information from the database, not for retrieving the actual data stored in the database. These statements are typically used to query the structure and configuration of the database.
SHOW CHARSET
SHOW CHARSET
Shows the available character sets. SHOW COLLATION SHOW COLLATION
SHOW COLLATION
Shows the collations supported by NebulaGraph. SHOW CREATE SPACE SHOW CREATE SPACE <space_name>
SHOW CREATE SPACE basketballplayer
Shows the creating statement of the specified graph space. SHOW CREATE TAG/EDGE SHOW CREATE {TAG <tag_name> | EDGE <edge_name>}
SHOW CREATE TAG player
Shows the basic information of the specified tag. SHOW HOSTS SHOW HOSTS [GRAPH | STORAGE | META]
SHOW HOSTS
SHOW HOSTS GRAPH
Shows the host and version information of Graph Service, Storage Service, and Meta Service. SHOW INDEX STATUS SHOW {TAG | EDGE} INDEX STATUS
SHOW TAG INDEX STATUS
Shows the status of jobs that rebuild native indexes, which helps check whether a native index is successfully rebuilt or not. SHOW INDEXES SHOW {TAG | EDGE} INDEXES
SHOW TAG INDEXES
Shows the names of existing native indexes. SHOW PARTS SHOW PARTS [<part_id>]
SHOW PARTS
Shows the information of a specified partition or all partitions in a graph space. SHOW ROLES SHOW ROLES IN <space_name>
SHOW ROLES in basketballplayer
Shows the roles that are assigned to a user account. SHOW SNAPSHOTS SHOW SNAPSHOTS
SHOW SNAPSHOTS
Shows the information of all the snapshots. SHOW SPACES SHOW SPACES
SHOW SPACES
Shows existing graph spaces in NebulaGraph. SHOW STATS SHOW STATS
SHOW STATS
Shows the statistics of the graph space collected by the latest STATS
job. SHOW TAGS/EDGES SHOW TAGS | EDGES
SHOW TAGS
,SHOW EDGES
Shows all the tags in the current graph space. SHOW USERS SHOW USERS
SHOW USERS
Shows the user information. SHOW SESSIONS SHOW SESSIONS
SHOW SESSIONS
Shows the information of all the sessions. SHOW SESSIONS SHOW SESSION <Session_Id>
SHOW SESSION 1623304491050858
Shows a specified session with its ID. SHOW QUERIES SHOW [ALL] QUERIES
SHOW QUERIES
Shows the information of working queries in the current session. SHOW META LEADER SHOW META LEADER
SHOW META LEADER
Shows the information of the leader in the current Meta cluster."},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#compound_queries","title":"Compound queries","text":"Query statements in NebulaGraph can be combined to achieve more complex queries.
When referencing the results of a subquery in a compound statement, you need to create an alias for the result and use the pipe symbol(|
) to pass it to the next subquery. Use $-
in the next subquery to reference the alias of that result. See Pipe Symbol for details.
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS dstid, properties($$).name AS Name | \\\n GO FROM $-.dstid OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n| \"player100\" |\n+-------------+\n
The pipe symbol |
is applicable only in nGQL and cannot be used in OpenCypher statements. If you need to perform compound queries using MATCH
statements, you can use the WITH clause.
Example:
nebula> MATCH (v:player)-->(v2:player) \\\n WITH DISTINCT v2 AS v2, v2.player.age AS Age \\\n ORDER BY Age \\\n WHERE Age<25 \\\n RETURN v2.player.name AS Name, Age;\n+----------------------+-----+\n| Name | Age |\n+----------------------+-----+\n| \"Luka Doncic\" | 20 |\n| \"Ben Simmons\" | 22 |\n| \"Kristaps Porzingis\" | 23 |\n+----------------------+-----+\n
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#more_information","title":"More information","text":"nGQL command cheatsheet
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/","title":"MATCH","text":"The MATCH
statement provides pattern-based search functionality, allowing you to retrieve data that matches one or more patterns in NebulaGraph. By defining one or more patterns, you can search for data that matches the patterns in NebulaGraph. Once the matching data is retrieved, you can use the RETURN
clause to return it as a result.
The examples in this topic use the basketballplayer dataset as the sample dataset.
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#syntax","title":"Syntax","text":"The syntax of MATCH
is relatively more flexible compared with that of other query statements such as GO
or LOOKUP
. The path type of the MATCH
statement is trail
. That is, only vertices can be repeatedly visited in the graph traversal. Edges cannot be repeatedly visited. For details, see path. But generally, it can be summarized as follows.
MATCH <pattern> [<clause_1>] RETURN <output> [<clause_2>];\n
pattern
: The MATCH
statement supports matching one or multiple patterns. Multiple patterns are separated by commas (,). For example: (a)-[]->(b),(c)-[]->(d)
. For the detailed description of patterns, see Patterns. clause_1
: The WHERE
, WITH
, UNWIND
, and OPTIONAL MATCH
clauses are supported, and the MATCH
clause can also be used.output
: Define the list name for the output results to be returned. You can use AS
to set an alias for the list.clause_2
: The ORDER BY
and LIMIT
clauses are supported.Legacy version compatibility
MATCH
statement supports full table scans. It can traverse vertices or edges in the graph without using any indexes or filter conditions. In previous versions, the MATCH
statement required an index for certain queries or needed to use LIMIT
to restrict the number of output results.RETURN <variable_name>.<property_name>
is changed to RETURN <variable_name>.<tag_name>.<property_name>
.v:player
and v.player.name
in the statement MATCH (v:player) RETURN v.player.name AS Name
.player
tag or the name property of the player
tag. For more information about the usage and considerations for indexes, see Must-read for using indexes.MATCH
statement cannot query dangling edges.You can use a user-defined variable in a pair of parentheses to represent a vertex in a pattern. For example: (v)
.
nebula> MATCH (v) \\\n RETURN v \\\n LIMIT 3;\n+-----------------------------------------------------------+\n| v |\n+-----------------------------------------------------------+\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n| (\"player106\" :player{age: 25, name: \"Kyle Anderson\"}) |\n| (\"player115\" :player{age: 40, name: \"Kobe Bryant\"}) |\n+-----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_tags","title":"Match tags","text":"Legacy version compatibility
LIMIT
to restrict the number of output results.MATCH
statement supports full table scans. There is no need to create an index for a tag or a specific property of a tag, nor use LIMIT
to restrict the number of output results in order to execute the MATCH
statement.You can specify a tag with :<tag_name>
after the vertex in a pattern.
nebula> MATCH (v:player) \\\n RETURN v;\n+---------------------------------------------------------------+\n| v |\n+---------------------------------------------------------------+\n| (\"player105\" :player{age: 31, name: \"Danny Green\"}) |\n| (\"player109\" :player{age: 34, name: \"Tiago Splitter\"}) |\n| (\"player111\" :player{age: 38, name: \"David West\"}) |\n...\n
To match vertices with multiple tags, use colons (:).
nebula> CREATE TAG actor (name string, age int);\nnebula> INSERT VERTEX actor(name, age) VALUES \"player100\":(\"Tim Duncan\", 42);\nnebula> MATCH (v:player:actor) \\\n RETURN v \\\n+----------------------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------------------+\n| (\"player100\" :actor{age: 42, name: \"Tim Duncan\"} :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_vertex_properties","title":"Match vertex properties","text":"Note
The prerequisite for matching a vertex property is that the tag itself has an index of the corresponding property. Otherwise, you cannot execute the MATCH
statement to match the property.
You can specify a vertex property with {<prop_name>: <prop_value>}
after the tag in a pattern.
# The following example uses the name property to match a vertex.\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
The WHERE
clause can do the same thing:
nebula> MATCH (v:player) \\\n WHERE v.player.name == \"Tim Duncan\" \\\n RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
OpenCypher compatibility
In openCypher 9, =
is the equality operator. However, in nGQL, ==
is the equality operator and =
is the assignment operator (as in C++ or Java).
Use the WHERE
clause to directly get all the vertices with the vertex property value Tim Duncan.
nebula> MATCH (v) \\\n WITH v, properties(v) as props, keys(properties(v)) as kk \\\n WHERE [i in kk where props[i] == \"Tim Duncan\"] \\\n RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n\nnebula> WITH ['Tim Duncan', 'Yao Ming'] AS names \\\n MATCH (v1:player)-->(v2:player) \\\n WHERE v1.player.name in names \\\n return v1, v2;\n+----------------------------------------------------+----------------------------------------------------------+\n| v1 | v2 |\n+----------------------------------------------------+----------------------------------------------------------+\n| (\"player133\" :player{age: 38, name: \"Yao Ming\"}) | (\"player114\" :player{age: 39, name: \"Tracy McGrady\"}) |\n| (\"player133\" :player{age: 38, name: \"Yao Ming\"}) | (\"player144\" :player{age: 47, name: \"Shaquille O'Neal\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n+----------------------------------------------------+----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_vids","title":"Match VIDs","text":"You can use the VID to match a vertex. The id()
function can retrieve the VID of a vertex.
nebula> MATCH (v) \\\n WHERE id(v) == 'player101' \\\n RETURN v;\n+-----------------------------------------------------+\n| v |\n+-----------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n+-----------------------------------------------------+\n
To match multiple VIDs, use WHERE id(v) IN [vid_list]
or WHERE id(v) IN {vid_list}
.
nebula> MATCH (v:player { name: 'Tim Duncan' })--(v2) \\\n WHERE id(v2) IN [\"player101\", \"player102\"] \\\n RETURN v2;\n+-----------------------------------------------------------+\n| v2 |\n+-----------------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n+-----------------------------------------------------------+\n\nnebula> MATCH (v) WHERE id(v) IN {\"player100\", \"player101\"} \\\n RETURN v.player.name AS name;\n+---------------+\n| name |\n+---------------+\n| \"Tony Parker\" |\n| \"Tim Duncan\" |\n+---------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_connected_vertices","title":"Match connected vertices","text":"You can use the --
symbol to represent edges of both directions and match vertices connected by these edges.
Legacy version compatibility
In nGQL 1.x, the --
symbol is used for inline comments. Starting from nGQL 2.x, the --
symbol represents an incoming or outgoing edge.
nebula> MATCH (v:player{name:\"Tim Duncan\"})--(v2) \\\n RETURN v2.player.name AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Manu Ginobili\" |\n| \"Manu Ginobili\" |\n| \"Tiago Splitter\" |\n...\n
You can add a >
or <
to the --
symbol to specify the direction of an edge.
In the following example, -->
represents an edge that starts from v
and points to v2
. To v
, this is an outgoing edge, and to v2
this is an incoming edge.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-->(v2:player) \\\n RETURN v2.player.name AS Name;\n+-----------------+\n| Name |\n+-----------------+\n| \"Manu Ginobili\" |\n| \"Tony Parker\" |\n+-----------------+\n
To query the properties of the target vertices, use the CASE
expression.
nebula> MATCH (v:player{name:\"Tim Duncan\"})--(v2) \\\n RETURN \\\n CASE WHEN v2.team.name IS NOT NULL \\\n THEN v2.team.name \\\n WHEN v2.player.name IS NOT NULL \\\n THEN v2.player.name END AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Manu Ginobili\" |\n| \"Manu Ginobili\" |\n| \"Spurs\" |\n| \"Dejounte Murray\" |\n...\n
To extend the pattern, you can add more vertices and edges.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-->(v2)<--(v3) \\\n RETURN v3.player.name AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Dejounte Murray\" |\n| \"LaMarcus Aldridge\" |\n| \"Marco Belinelli\" |\n...\n
If you do not need to refer to a vertex, you can omit the variable representing it in the parentheses.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-->()<--(v3) \\\n RETURN v3.player.name AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Dejounte Murray\" |\n| \"LaMarcus Aldridge\" |\n| \"Marco Belinelli\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_paths","title":"Match paths","text":"Connected vertices and edges form a path. You can use a user-defined variable to name a path as follows.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) \\\n RETURN p;\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})> |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n
OpenCypher compatibility
In nGQL, the @
symbol represents the rank of an edge, but openCypher has no such concept.
nebula> MATCH ()<-[e]-() \\\n RETURN e \\\n LIMIT 3;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| [:follow \"player103\"->\"player102\" @0 {degree: 70}] |\n| [:follow \"player135\"->\"player102\" @0 {degree: 80}] |\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_edge_types","title":"Match edge types","text":"Just like vertices, you can specify edge types with :<edge_type>
in a pattern. For example: -[e:follow]-
.
OpenCypher compatibility
LIMIT
to limit the number of output results and you must specify the direction of the edge.MATCH
statement to match edges without creating an index for edge type or using LIMIT
to restrict the number of output results.nebula> MATCH ()-[e:follow]->() \\\n RETURN e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player102\"->\"player100\" @0 {degree: 75}] |\n| [:follow \"player102\"->\"player101\" @0 {degree: 75}] |\n| [:follow \"player129\"->\"player116\" @0 {degree: 90}] |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_edge_type_properties","title":"Match edge type properties","text":"Note
The prerequisite for matching an edge type property is that the edge type itself has an index of the corresponding property. Otherwise, you cannot execute the MATCH
statement to match the property.
You can specify edge type properties with {<prop_name>: <prop_value>}
in a pattern. For example: [e:follow{likeness:95}]
.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e:follow{degree:95}]->(v2) \\\n RETURN e;\n+--------------------------------------------------------+\n| e |\n+--------------------------------------------------------+\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n+--------------------------------------------------------+\n
Use the WHERE
clause to directly get all the edges with the edge property value 90.
nebula> MATCH ()-[e]->() \\\n WITH e, properties(e) as props, keys(properties(e)) as kk \\\n WHERE [i in kk where props[i] == 90] \\\n RETURN e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player125\"->\"player100\" @0 {degree: 90}] |\n| [:follow \"player140\"->\"player114\" @0 {degree: 90}] |\n| [:follow \"player133\"->\"player144\" @0 {degree: 90}] |\n| [:follow \"player133\"->\"player114\" @0 {degree: 90}] |\n...\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_multiple_edge_types","title":"Match multiple edge types","text":"The |
symbol can help matching multiple edge types. For example: [e:follow|:serve]
. The English colon (:) before the first edge type cannot be omitted, but the English colon before the subsequent edge type can be omitted, such as [e:follow|serve]
.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e:follow|:serve]->(v2) \\\n RETURN e;\n+---------------------------------------------------------------------------+\n| e |\n+---------------------------------------------------------------------------+\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n| [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] |\n+---------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_multiple_edges","title":"Match multiple edges","text":"You can extend a pattern to match multiple edges in a path.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[]->(v2)<-[e:serve]-(v3) \\\n RETURN v2, v3;\n+----------------------------------+-----------------------------------------------------------+\n| v2 | v3 |\n+----------------------------------+-----------------------------------------------------------+\n| (\"team204\" :team{name: \"Spurs\"}) | (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}) |\n| (\"team204\" :team{name: \"Spurs\"}) | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"team204\" :team{name: \"Spurs\"}) | (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_fixed-length_paths","title":"Match fixed-length paths","text":"You can use the :<edge_type>*<hop>
pattern to match a fixed-length path. hop
must be a non-negative integer.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n RETURN DISTINCT v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n+-----------------------------------------------------------+\n
If hop
is 0, the pattern will match the source vertex of the path.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) -[*0]-> (v2) \\\n RETURN v2;\n+----------------------------------------------------+\n| v2 |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
Note
When you conditionally filter on multi-hop edges, such as -[e:follow*2]->
, note that the e
is a list of edges instead of a single edge.
For example, the following statement is correct from the syntax point of view which may not get your expected query result, because the e
is a list without the .degree
property.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n WHERE e.degree > 1 \\\n RETURN DISTINCT v2 AS Friends;\n
The correct statement is as follows:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n WHERE ALL(e_ in e WHERE e_.degree > 0) \\\n RETURN DISTINCT v2 AS Friends;\n
Further, the following statement is for filtering the properties of the first-hop edge in multi-hop edges:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n WHERE e[0].degree > 98 \\\n RETURN DISTINCT v2 AS Friends;\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_variable-length_paths","title":"Match variable-length paths","text":"You can use the :<edge_type>*[minHop..maxHop]
pattern to match variable-length paths.minHop
and maxHop
are optional and default to 1 and infinity respectively.
Caution
If maxHop
is not set, it may cause the Graph service to OOM. Execute this command with caution.
minHop
Optional. minHop
indicates the minimum length of the path, which must be a non-negative integer. The default value is 1. maxHop
Optional. maxHop
indicates the maximum length of the path, which must be a non-negative integer. The default value is infinity. If neither minHop
nor maxHop
is specified, and only :<edge_type>*
is set, the default values are applied to both, i.e., minHop
is 1 and maxHop
is infinity.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*]->(v2) \\\n RETURN v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n...\n\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..3]->(v2) \\\n RETURN v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n...\n\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..]->(v2) \\\n RETURN v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n...\n
You can use the DISTINCT
keyword to aggregate duplicate results.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..3]->(v2:player) \\\n RETURN DISTINCT v2 AS Friends, count(v2);\n+-----------------------------------------------------------+-----------+\n| Friends | count(v2) |\n+-----------------------------------------------------------+-----------+\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) | 1 |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | 4 |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | 3 |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) | 3 |\n+-----------------------------------------------------------+-----------+\n
If minHop
is 0
, the pattern will match the source vertex of the path. Compared to the preceding statement, the following example uses 0
as the minHop
. So in the following result set, \"Tim Duncan\"
is counted one more time than it is in the preceding result set because it is the source vertex.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*0..3]->(v2:player) \\\n RETURN DISTINCT v2 AS Friends, count(v2);\n+-----------------------------------------------------------+-----------+\n| Friends | count(v2) |\n+-----------------------------------------------------------+-----------+\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) | 1 |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | 5 |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) | 3 |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | 3 |\n+-----------------------------------------------------------+-----------+\n
Note
When using the variable e
to match fixed-length or variable-length paths in a pattern, such as -[e:follow*0..3]->
, it is not supported to reference e
in other patterns. For example, the following statement is not supported.
nebula> MATCH (v:player)-[e:like*1..3]->(n) \\\n WHERE (n)-[e*1..4]->(:player) \\\n RETURN v;\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_variable-length_paths_with_multiple_edge_types","title":"Match variable-length paths with multiple edge types","text":"You can specify multiple edge types in a fixed-length or variable-length pattern. In this case, hop
, minHop
, and maxHop
take effect on all edge types.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow|serve*2]->(v2) \\\n RETURN DISTINCT v2;\n+-----------------------------------------------------------+\n| v2 |\n+-----------------------------------------------------------+\n| (\"team204\" :team{name: \"Spurs\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n| (\"team215\" :team{name: \"Hornets\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n+-----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_multiple_patterns","title":"Match multiple patterns","text":"You can separate multiple patterns with commas (,).
nebula> CREATE TAG INDEX IF NOT EXISTS team_index ON team(name(20));\nnebula> REBUILD TAG INDEX team_index;\nnebula> MATCH (v1:player{name:\"Tim Duncan\"}), (v2:team{name:\"Spurs\"}) \\\n RETURN v1,v2;\n+----------------------------------------------------+----------------------------------+\n| v1 | v2 |\n+----------------------------------------------------+----------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"team204\" :team{name: \"Spurs\"}) |\n+----------------------------------------------------+----------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_shortest_paths","title":"Match shortest paths","text":"The allShortestPaths
function can be used to find all shortest paths between two vertices.
nebula> MATCH p = allShortestPaths((a:player{name:\"Tim Duncan\"})-[e*..5]-(b:player{name:\"Tony Parker\"})) \\\n RETURN p;\n+------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})<-[:follow@0 {degree: 95}]-(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n+------------------------------------------------------------------------------------------------------------------------------------+\n
The shortestPath
function can be used to find a single shortest path between two vertices.
nebula> MATCH p = shortestPath((a:player{name:\"Tim Duncan\"})-[e*..5]-(b:player{name:\"Tony Parker\"})) \\\n RETURN p;\n+------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})<-[:follow@0 {degree: 95}]-(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n+------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#retrieve_with_multiple_match","title":"Retrieve with multiple match","text":"Multiple MATCH
can be used when different patterns have different filtering criteria and return the rows that exactly match the pattern.
nebula> MATCH (m)-[]->(n) WHERE id(m)==\"player100\" \\\n MATCH (n)-[]->(l) WHERE id(n)==\"player125\" \\\n RETURN id(m),id(n),id(l);\n+-------------+-------------+-------------+\n| id(m) | id(n) | id(l) |\n+-------------+-------------+-------------+\n| \"player100\" | \"player125\" | \"team204\" |\n| \"player100\" | \"player125\" | \"player100\" |\n+-------------+-------------+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#retrieve_with_optional_match","title":"Retrieve with optional match","text":"See OPTIONAL MATCH.
Caution
In NebulaGraph, the performance and resource usage of the MATCH
statement have been optimized. But we still recommend to use GO
, LOOKUP
, |
, and FETCH
instead of MATCH
when high performance is required.
The GO
statement is used in the NebulaGraph database to traverse the graph starting from a given starting vertex with specified filters and return results.
This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#syntax","title":"Syntax","text":"GO [[<M> TO] <N> {STEP|STEPS}] FROM <vertex_list>\nOVER <edge_type_list> [{REVERSELY | BIDIRECT}]\n[ WHERE <conditions>\u00a0]\nYIELD\u00a0[DISTINCT] <return_list>\n[{SAMPLE <sample_list> | <limit_by_list_clause>}]\n[| GROUP BY {col_name | expr | position} YIELD <col_name>]\n[| ORDER BY <expression> [{ASC | DESC}]]\n[| LIMIT [<offset>,] <number_rows>];\n\n<vertex_list> ::=\n <vid> [, <vid> ...]\n\n<edge_type_list> ::=\n edge_type [, edge_type ...]\n | *\n\n<return_list> ::=\n <col_name> [AS <col_alias>] [, <col_name> [AS <col_alias>] ...]\n
<N> {STEP|STEPS}
: specifies the hop number. If not specified, the default value for N
is one
. When N
is zero
, NebulaGraph does not traverse any edges and returns nothing.
Note
The path type of the GO
statement is walk
, which means both vertices and edges can be repeatedly visited in graph traversal. For more information, see Path.
M TO N {STEP|STEPS}
: traverses from M to N
hops. When M
is zero
, the output is the same as that of M
is one
. That is, the output of GO 0 TO 2
and GO 1 TO 2
are the same.<vertex_list>
: represents a list of vertex IDs separated by commas.<edge_type_list>
: represents a list of edge types which the traversal can go through.REVERSELY | BIDIRECT
: defines the direction of the query. By default, the GO
statement searches for outgoing edges of <vertex_list>
. If REVERSELY
is set, GO
searches for incoming edges. If BIDIRECT
is set, GO
searches for edges of both directions. The direction of the query can be checked by returning the <edge_type>._type
field using YIELD
. A positive value indicates an outgoing edge, while a negative value indicates an incoming edge.WHERE <expression>
: specifies the traversal filters. You can use the WHERE
clause for the source vertices, the edges, and the destination vertices. You can use it together with AND
, OR
, NOT
, and XOR
. For more information, see WHERE.
Note
WHERE
clause when you traverse along with multiple edge types. For example, WHERE edge1.prop1 > edge2.prop2
is not supported.YIELD [DISTINCT] <return_list>
: defines the output to be returned. It is recommended to use the Schema-related functions to fill in <return_list>
. src(edge)
, dst(edge)
, type(edge) )
, rank(edge)
, etc., are currently supported, while nested functions are not. For more information, see YIELD.SAMPLE <sample_list>
: takes samples from the result set. For more information, see SAMPLE.<limit_by_list_clause>
: limits the number of outputs during the traversal process. For more information, see LIMIT.GROUP BY
: groups the output into subgroups based on the value of the specified property. For more information, see GROUP BY. After grouping, you need to use YIELD
again to define the output that needs to be returned.ORDER BY
: sorts outputs with specified orders. For more information, see ORDER BY.
Note
When the sorting method is not specified, the output orders can be different for the same query.
LIMIT [<offset>,] <number_rows>]
: limits the number of rows of the output. For more information, see LIMIT.WHERE
and YIELD
clauses in GO
statements usually utilize property reference symbols ($^
and $$
) or the properties($^)
and properties($$)
functions to specify the properties of a vertex; use the properties(edge)
function to specify the properties of an edge. For details, see Property Reference Symbols and Schema-related Functions.GO
statement, you need to set a name for the result and pass it to the next subquery using the pipe symbol |
, and reference the name of the result in the next subquery using $-
. See the Pipe Operator for details.NULL
.For example, to query the team that a person belongs to, assuming that the person is connected to the team by the serve
edge and the person's ID is player102
.
nebula>\u00a0GO FROM \"player102\" OVER serve YIELD dst(edge);\n+-----------+\n| dst(EDGE) |\n+-----------+\n| \"team203\" |\n| \"team204\" |\n+-----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_all_vertices_within_a_specified_number_of_hops_from_a_starting_vertex","title":"To query all vertices within a specified number of hops from a starting vertex","text":"For example, to query all vertices within two hops of a person vertex, assuming that the person is connected to other people by the follow
edge and the person's ID is player102
.
# Return all vertices that are 2 hops away from the player102 vertex.\nnebula> GO 2 STEPS FROM \"player102\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n+-------------+\n
# Return all vertices within 1 or 2 hops away from the player102 vertex.\nnebula> GO 1 TO 2 STEPS FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n...\n\n# The following MATCH query has the same semantics as the previous GO query.\nnebula> MATCH (v) -[e:follow*1..2]->(v2) \\\n WHERE id(v) == \"player100\" \\\n RETURN id(v2) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_add_filtering_conditions","title":"To add filtering conditions","text":"Case: To query the vertices and edges that meet specific conditions.
For example, use the WHERE
clause to query the edges with specific properties between the starting vertex and the destination vertex.
nebula>\u00a0GO FROM \"player100\", \"player102\" OVER serve \\\n WHERE properties(edge).start_year > 1995 \\\n YIELD DISTINCT properties($$).name AS team_name, properties(edge).start_year AS start_year, properties($^).name AS player_name;\n\n+-----------------+------------+---------------------+\n| team_name | start_year | player_name |\n+-----------------+------------+---------------------+\n| \"Spurs\" | 1997 | \"Tim Duncan\" |\n| \"Trail Blazers\" | 2006 | \"LaMarcus Aldridge\" |\n| \"Spurs\" | 2015 | \"LaMarcus Aldridge\" |\n+-----------------+------------+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_all_edges","title":"To query all edges","text":"Case: To query all edges that are connected to the starting vertex.
# Return all edges that are connected to the player102 vertex.\nnebula> GO FROM \"player102\" OVER * BIDIRECT YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| [:follow \"player103\"->\"player102\" @0 {degree: 70}] |\n| [:follow \"player135\"->\"player102\" @0 {degree: 80}] |\n| [:follow \"player102\"->\"player100\" @0 {degree: 75}] |\n| [:follow \"player102\"->\"player101\" @0 {degree: 75}] |\n| [:serve \"player102\"->\"team203\" @0 {end_year: 2015, start_year: 2006}] |\n| [:serve \"player102\"->\"team204\" @0 {end_year: 2019, start_year: 2015}] |\n+-----------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_multiple_edge_types","title":"To query multiple edge types","text":"Case: To query multiple edge types that are connected to the starting vertex. You can specify multiple edge types or the *
symbol to query multiple edge types.
For example, to query the follow
and serve
edges that are connected to the starting vertex.
nebula> GO FROM \"player100\" OVER follow, serve \\\n YIELD properties(edge).degree, properties(edge).start_year;\n+-------------------------+-----------------------------+\n| properties(EDGE).degree | properties(EDGE).start_year |\n+-------------------------+-----------------------------+\n| 95 | __NULL__ |\n| 95 | __NULL__ |\n| __NULL__ | 1997 |\n+-------------------------+-----------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_incoming_vertices_using_the_reversely_keyword","title":"To query incoming vertices using the REVERSELY keyword","text":"# Return the vertices that follow the player100 vertex.\nnebula> GO FROM \"player100\" OVER follow REVERSELY \\\n YIELD src(edge) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player101\" |\n| \"player102\" |\n...\n\n# The following MATCH query has the same semantics as the previous GO query.\nnebula> MATCH (v)<-[e:follow]- (v2) WHERE id(v) == 'player100' \\\n RETURN id(v2) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player101\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_use_subqueries_as_the_starting_vertice_of_a_graph_traversal","title":"To use subqueries as the starting vertice of a graph traversal","text":"# Return the friends of the player100 vertex and the teams that the friends belong to.\nnebula> GO FROM \"player100\" OVER follow REVERSELY \\\n YIELD src(edge) AS id | \\\n GO FROM $-.id OVER serve \\\n WHERE properties($^).age > 20 \\\n YIELD properties($^).name AS FriendOf, properties($$).name AS Team;\n+---------------------+-----------------+\n| FriendOf | Team |\n+---------------------+-----------------+\n| \"Boris Diaw\" | \"Spurs\" |\n| \"Boris Diaw\" | \"Jazz\" |\n| \"Boris Diaw\" | \"Suns\" |\n...\n\n# The following MATCH query has the same semantics as the previous GO query.\nnebula> MATCH (v)<-[e:follow]- (v2)-[e2:serve]->(v3) \\\n WHERE id(v) == 'player100' \\\n RETURN v2.player.name AS FriendOf, v3.team.name AS Team;\n+---------------------+-----------------+\n| FriendOf | Team |\n+---------------------+-----------------+\n| \"Boris Diaw\" | \"Spurs\" |\n| \"Boris Diaw\" | \"Jazz\" |\n| \"Boris Diaw\" | \"Suns\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_use_group_by_to_group_the_output","title":"To use GROUP BY
to group the output","text":"You need to use YIELD
to define the output that needs to be returned after grouping.
# The following example collects the outputs according to age.\nnebula> GO 2 STEPS FROM \"player100\" OVER follow \\\n YIELD src(edge) AS src, dst(edge) AS dst, properties($$).age AS age \\\n | GROUP BY $-.dst \\\n YIELD $-.dst AS dst, collect_set($-.src) AS src, collect($-.age) AS age;\n+-------------+----------------------------+----------+\n| dst | src | age |\n+-------------+----------------------------+----------+\n| \"player125\" | {\"player101\"} | [41] |\n| \"player100\" | {\"player125\", \"player101\"} | [42, 42] |\n| \"player102\" | {\"player101\"} | [33] |\n+-------------+----------------------------+----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_use_order_by_and_limit_to_sort_and_limit_the_output","title":"To use ORDER BY
and LIMIT
to sort and limit the output","text":"# The following example groups the outputs and restricts the number of rows of the outputs.\nnebula> $a = GO FROM \"player100\" OVER follow YIELD src(edge) AS src, dst(edge) AS dst; \\\n GO 2 STEPS FROM $a.dst OVER follow \\\n YIELD $a.src AS src, $a.dst, src(edge), dst(edge) \\\n | ORDER BY $-.src | OFFSET 1 LIMIT 2;\n+-------------+-------------+-------------+-------------+\n| src | $a.dst | src(EDGE) | dst(EDGE) |\n+-------------+-------------+-------------+-------------+\n| \"player100\" | \"player101\" | \"player100\" | \"player101\" |\n| \"player100\" | \"player125\" | \"player100\" | \"player125\" |\n+-------------+-------------+-------------+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#other_examples","title":"Other examples","text":"# The following example determines if $$.player.name IS NOT EMPTY.\nnebula> GO FROM \"player100\" OVER follow WHERE properties($$).name IS NOT EMPTY YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player125\" |\n| \"player101\" |\n+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/","title":"FETCH","text":"The FETCH
statement retrieves the properties of the specified vertices or edges.
This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties","title":"Fetch vertex properties","text":""},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#syntax","title":"Syntax","text":"FETCH PROP ON {<tag_name>[, tag_name ...] | *}\n<vid> [, vid ...]\nYIELD [DISTINCT] <return_list> [AS <alias>];\n
Parameter Description tag_name
The name of the tag. *
Represents all the tags in the current graph space. vid
The vertex ID. YIELD
Define the output to be returned. For details, see YIELD
. AS
Set an alias."},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties_by_one_tag","title":"Fetch vertex properties by one tag","text":"Specify a tag in the FETCH
statement to fetch the vertex properties by that tag.
nebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n+-------------------------------+\n| properties(VERTEX) |\n+-------------------------------+\n| {age: 42, name: \"Tim Duncan\"} |\n+-------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_specific_properties_of_a_vertex","title":"Fetch specific properties of a vertex","text":"Use a YIELD
clause to specify the properties to be returned.
nebula> FETCH PROP ON player \"player100\" \\\n YIELD properties(vertex).name AS name;\n+--------------+\n| name |\n+--------------+\n| \"Tim Duncan\" |\n+--------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_properties_of_multiple_vertices","title":"Fetch properties of multiple vertices","text":"Specify multiple VIDs (vertex IDs) to fetch properties of multiple vertices. Separate the VIDs with commas.
nebula> FETCH PROP ON player \"player101\", \"player102\", \"player103\" YIELD properties(vertex);\n+--------------------------------------+\n| properties(VERTEX) |\n+--------------------------------------+\n| {age: 33, name: \"LaMarcus Aldridge\"} |\n| {age: 36, name: \"Tony Parker\"} |\n| {age: 32, name: \"Rudy Gay\"} |\n+--------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties_by_multiple_tags","title":"Fetch vertex properties by multiple tags","text":"Specify multiple tags in the FETCH
statement to fetch the vertex properties by the tags. Separate the tags with commas.
# The following example creates a new tag t1.\nnebula> CREATE TAG IF NOT EXISTS t1(a string, b int);\n\n# The following example attaches t1 to the vertex \"player100\".\nnebula> INSERT VERTEX t1(a, b) VALUES \"player100\":(\"Hello\", 100);\n\n# The following example fetches the properties of vertex \"player100\" by the tags player and t1.\nnebula> FETCH PROP ON player, t1 \"player100\" YIELD vertex AS v;\n+----------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :t1{a: \"Hello\", b: 100}) |\n+----------------------------------------------------------------------------+\n
You can combine multiple tags with multiple VIDs in a FETCH
statement.
nebula> FETCH PROP ON player, t1 \"player100\", \"player103\" YIELD vertex AS v;\n+----------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :t1{a: \"Hello\", b: 100}) |\n| (\"player103\" :player{age: 32, name: \"Rudy Gay\"}) |\n+----------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties_by_all_tags","title":"Fetch vertex properties by all tags","text":"Set an asterisk symbol *
to fetch properties by all tags in the current graph space.
nebula> FETCH PROP ON * \"player100\", \"player106\", \"team200\" YIELD vertex AS v;\n+----------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :t1{a: \"Hello\", b: 100}) |\n| (\"player106\" :player{age: 25, name: \"Kyle Anderson\"}) |\n| (\"team200\" :team{name: \"Warriors\"}) |\n+----------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_edge_properties","title":"Fetch edge properties","text":""},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#syntax_1","title":"Syntax","text":"FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]\nYIELD <output>;\n
Parameter Description edge_type
The name of the edge type. src_vid
The VID of the source vertex. It specifies the start of an edge. dst_vid
The VID of the destination vertex. It specifies the end of an edge. rank
The rank of the edge. It is optional and defaults to 0
. It distinguishes an edge from other edges with the same edge type, source vertex, and destination vertex. YIELD
Define the output to be returned. For details, see YIELD
."},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_all_properties_of_an_edge","title":"Fetch all properties of an edge","text":"The following statement fetches all the properties of the serve
edge that connects vertex \"player100\"
and vertex \"team204\"
.
nebula> FETCH PROP ON serve \"player100\" -> \"team204\" YIELD properties(edge);\n+------------------------------------+\n| properties(EDGE) |\n+------------------------------------+\n| {end_year: 2016, start_year: 1997} |\n+------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_specific_properties_of_an_edge","title":"Fetch specific properties of an edge","text":"Use a YIELD
clause to fetch specific properties of an edge.
nebula> FETCH PROP ON serve \"player100\" -> \"team204\" \\\n YIELD properties(edge).start_year;\n+-----------------------------+\n| properties(EDGE).start_year |\n+-----------------------------+\n| 1997 |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_properties_of_multiple_edges","title":"Fetch properties of multiple edges","text":"Specify multiple edge patterns (<src_vid> -> <dst_vid>[@<rank>]
) to fetch properties of multiple edges. Separate the edge patterns with commas.
nebula> FETCH PROP ON serve \"player100\" -> \"team204\", \"player133\" -> \"team202\" YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] |\n| [:serve \"player133\"->\"team202\" @0 {end_year: 2011, start_year: 2002}] |\n+-----------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_properties_based_on_edge_rank","title":"Fetch properties based on edge rank","text":"If there are multiple edges with the same edge type, source vertex, and destination vertex, you can specify the rank to fetch the properties on the correct edge.
# The following example inserts edges with different ranks and property values.\nnebula> insert edge serve(start_year,end_year) \\\n values \"player100\"->\"team204\"@1:(1998, 2017);\n\nnebula> insert edge serve(start_year,end_year) \\\n values \"player100\"->\"team204\"@2:(1990, 2018);\n\n# By default, the FETCH statement returns the edge whose rank is 0.\nnebula> FETCH PROP ON serve \"player100\" -> \"team204\" YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] |\n+-----------------------------------------------------------------------+\n\n# To fetch on an edge whose rank is not 0, set its rank in the FETCH statement.\nnebula> FETCH PROP ON serve \"player100\" -> \"team204\"@1 YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player100\"->\"team204\" @1 {end_year: 2017, start_year: 1998}] |\n+-----------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#use_fetch_in_composite_queries","title":"Use FETCH in composite queries","text":"A common way to use FETCH
is to combine it with native nGQL such as GO
.
The following statement returns the degree
values of the follow
edges that start from vertex \"player101\"
.
nebula> GO FROM \"player101\" OVER follow \\\n YIELD src(edge) AS s, dst(edge) AS d \\\n | FETCH PROP ON follow $-.s -> $-.d \\\n YIELD properties(edge).degree;\n+-------------------------+\n| properties(EDGE).degree |\n+-------------------------+\n| 95 |\n| 90 |\n| 95 |\n+-------------------------+\n
Or you can use user-defined variables to construct similar queries.
nebula> $var = GO FROM \"player101\" OVER follow \\\n YIELD src(edge) AS s, dst(edge) AS d; \\\n FETCH PROP ON follow $var.s -> $var.d \\\n YIELD properties(edge).degree;\n+-------------------------+\n| properties(EDGE).degree |\n+-------------------------+\n| 95 |\n| 90 |\n| 95 |\n+-------------------------+\n
For more information about composite queries, see Composite queries (clause structure).
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/","title":"LOOKUP","text":"The LOOKUP
statement traverses data based on indexes. You can use LOOKUP
for the following purposes: searching for vertices or edges based on conditions defined in a WHERE clause, listing vertices with a given tag or edges with a given edge type, and counting vertices or edges by tag or edge type. This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/#precautions","title":"Precautions","text":"If the specified property is not indexed when using the LOOKUP
statement, NebulaGraph randomly selects one of the available indexes.
For example, the tag player
has two properties, name
and age
. Both the tag player
itself and the property name
have indexes, but the property age
has no indexes. When running LOOKUP ON player WHERE player.age == 36 YIELD player.name;
, NebulaGraph randomly uses one of the indexes of the tag player
or the property name
. You can use the EXPLAIN
statement to check the selected index.
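For instance, a quick check might look like the following sketch; the chosen index appears in the IndexScan node of the returned execution plan (plan output omitted here).
nebula> EXPLAIN LOOKUP ON player WHERE player.age == 36 YIELD player.name;\n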
Legacy version compatibility
Before the release 2.5.0, if the specified property is not indexed when using the LOOKUP
statement, NebulaGraph reports an error and does not use other indexes.
Before using the LOOKUP
statement, make sure that at least one index is created. If related vertices, edges, or properties already exist before the index is created, you must rebuild the index after creating it to make it valid.
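A minimal sketch of that order of operations, assuming the player tag used throughout this manual and a hypothetical index name player_age_index:
# Hypothetical index name; adjust to your schema. Rebuilding indexes\n# pre-existing data that was inserted before the index was created.\nnebula> CREATE TAG INDEX IF NOT EXISTS player_age_index ON player(age);\n\nnebula> REBUILD TAG INDEX player_age_index;\n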
LOOKUP ON {<vertex_tag> | <edge_type>}\n[WHERE <expression> [AND <expression> ...]]\nYIELD [DISTINCT] <return_list> [AS <alias>];\n\n<return_list>\n <prop_name> [AS <col_alias>] [, <prop_name> [AS <prop_alias>] ...];\n
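As a sketch of the optional clauses above (assuming an index covering the age property already exists), DISTINCT de-duplicates the yielded values and AS renames the output column:
# De-duplicated ages of all indexed player vertices older than 40.\nnebula> LOOKUP ON player WHERE player.age > 40 \\\n        YIELD DISTINCT properties(vertex).age AS age;\n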
WHERE <expression>
: filters data with specified conditions. Both AND
and OR
are supported between different expressions. For more information, see WHERE.YIELD
: Define the output to be returned. For details, see YIELD
.DISTINCT
: Aggregate the output results and return the de-duplicated result set.AS
: Set an alias.WHERE
in LOOKUP
","text":"The WHERE
clause in a LOOKUP
statement does not support the following operations:
References to $- and $^. The rank() function. Relational expressions with a property on each side of the operator, such as tagName.prop1 > tagName.prop2. The XOR operation. String operations other than STARTS WITH. The following example returns vertices whose name
is Tony Parker
and the tag is player
.
nebula> CREATE TAG INDEX IF NOT EXISTS index_player ON player(name(30), age);\n\nnebula> REBUILD TAG INDEX index_player;\n+------------+\n| New Job Id |\n+------------+\n| 15 |\n+------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name == \"Tony Parker\" \\\n YIELD id(vertex);\n+---------------+\n| id(VERTEX) |\n+---------------+\n| \"player101\" |\n+---------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name == \"Tony Parker\" \\\n YIELD properties(vertex).name AS name, properties(vertex).age AS age;\n+---------------+-----+\n| name | age |\n+---------------+-----+\n| \"Tony Parker\" | 36 |\n+---------------+-----+\n\nnebula> LOOKUP ON player \\\n WHERE player.age > 45 \\\n YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player144\" |\n| \"player140\" |\n+-------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name STARTS WITH \"B\" \\\n AND player.age IN [22,30] \\\n YIELD properties(vertex).name, properties(vertex).age;\n+-------------------------+------------------------+\n| properties(VERTEX).name | properties(VERTEX).age |\n+-------------------------+------------------------+\n| \"Ben Simmons\" | 22 |\n| \"Blake Griffin\" | 30 |\n+-------------------------+------------------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name == \"Kobe Bryant\"\\\n YIELD id(vertex) AS VertexID, properties(vertex).name AS name |\\\n GO FROM $-.VertexID OVER serve \\\n YIELD $-.name, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+---------------+-----------------------------+---------------------------+---------------------+\n| $-.name | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+---------------+-----------------------------+---------------------------+---------------------+\n| \"Kobe Bryant\" | 1996 | 2016 | \"Lakers\" |\n+---------------+-----------------------------+---------------------------+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/#retrieve_edges","title":"Retrieve edges","text":"The following example returns edges whose degree
is 90
and the edge type is follow
.
nebula> CREATE EDGE INDEX IF NOT EXISTS index_follow ON follow(degree);\n\nnebula> REBUILD EDGE INDEX index_follow;\n+------------+\n| New Job Id |\n+------------+\n| 62 |\n+------------+\n\nnebula> LOOKUP ON follow \\\n WHERE follow.degree == 90 YIELD edge AS e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player109\"->\"player125\" @0 {degree: 90}] |\n| [:follow \"player118\"->\"player120\" @0 {degree: 90}] |\n| [:follow \"player118\"->\"player131\" @0 {degree: 90}] |\n...\n\nnebula> LOOKUP ON follow \\\n WHERE follow.degree == 90 \\\n YIELD properties(edge).degree;\n+-------------+-------------+---------+-------------------------+\n| SrcVID | DstVID | Ranking | properties(EDGE).degree |\n+-------------+-------------+---------+-------------------------+\n| \"player150\" | \"player143\" | 0 | 90 |\n| \"player150\" | \"player137\" | 0 | 90 |\n| \"player148\" | \"player136\" | 0 | 90 |\n...\n\nnebula> LOOKUP ON follow \\\n WHERE follow.degree == 60 \\\n YIELD dst(edge) AS DstVID, properties(edge).degree AS Degree |\\\n GO FROM $-.DstVID OVER serve \\\n YIELD $-.DstVID, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+-------------+-----------------------------+---------------------------+---------------------+\n| $-.DstVID | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+-------------+-----------------------------+---------------------------+---------------------+\n| \"player105\" | 2010 | 2018 | \"Spurs\" |\n| \"player105\" | 2009 | 2010 | \"Cavaliers\" |\n| \"player105\" | 2018 | 2019 | \"Raptors\" |\n+-------------+-----------------------------+---------------------------+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/#list_vertices_or_edges_with_a_tag_or_an_edge_type","title":"List vertices or edges with a tag or an edge type","text":"To list vertices or edges with a tag or an edge type, at least one index must exist on the tag, the edge type, or its property.
For example, if there is a player
tag with a name
property and an age
property, to retrieve the VID of all vertices tagged with player
, there has to be an index on the player
tag itself, the name
property, or the age
property.
player
.nebula> CREATE TAG IF NOT EXISTS player(name string,age int);\n\nnebula> CREATE TAG INDEX IF NOT EXISTS player_index ON player();\n\nnebula> REBUILD TAG INDEX player_index;\n+------------+\n| New Job Id |\n+------------+\n| 66         |\n+------------+\n\nnebula> INSERT VERTEX player(name,age) \\\n        VALUES "player100":("Tim Duncan", 42), "player101":("Tony Parker", 36);\n\n# The following statement retrieves the VID of all vertices with the tag `player`.\n# It is similar to `MATCH (n:player) RETURN id(n) /*, n */`.\nnebula> LOOKUP ON player YIELD id(vertex);\n+-------------+\n| id(VERTEX)  |\n+-------------+\n| "player100" |\n| "player101" |\n...\n
follow
edge type.nebula> CREATE EDGE IF NOT EXISTS follow(degree int);\n\nnebula> CREATE EDGE INDEX IF NOT EXISTS follow_index ON follow();\n\nnebula> REBUILD EDGE INDEX follow_index;\n+------------+\n| New Job Id |\n+------------+\n| 88         |\n+------------+\n\nnebula> INSERT EDGE follow(degree) \\\n        VALUES "player100"->"player101":(95);\n\n# The following statement retrieves all edges with the edge type `follow`.\n# It is similar to `MATCH (s)-[e:follow]->(d) RETURN id(s), rank(e), id(d) /*, type(e) */`.\nnebula> LOOKUP ON follow YIELD edge AS e;\n+-----------------------------------------------------+\n| e                                                   |\n+-----------------------------------------------------+\n| [:follow "player105"->"player100" @0 {degree: 70}]  |\n| [:follow "player105"->"player116" @0 {degree: 80}]  |\n| [:follow "player109"->"player100" @0 {degree: 80}]  |\n...\n
The following example shows how to count the number of vertices tagged with player
and edges of the follow
edge type.
nebula> LOOKUP ON player YIELD id(vertex)|\\\n YIELD COUNT(*) AS Player_Number;\n+---------------+\n| Player_Number |\n+---------------+\n| 51 |\n+---------------+\n\nnebula> LOOKUP ON follow YIELD edge AS e| \\\n YIELD COUNT(*) AS Follow_Number;\n+---------------+\n| Follow_Number |\n+---------------+\n| 81 |\n+---------------+\n
Note
You can also use SHOW STATS
to count the numbers of vertices or edges.
The FIND PATH
statement finds the paths between the selected source vertices and destination vertices.
Note
To improve the query performance with the FIND PATH
statement, you can add the num_operator_threads
parameter in the nebula-graphd.conf
configuration file. The value range of the num_operator_threads
parameter is [2, 10]. Make sure that the value is not greater than the number of CPU cores of the machine where the graphd
service is deployed. It is recommended to set the value to the number of CPU cores of the machine where the graphd
service is deployed. For more information about the nebula-graphd.conf
configuration file, see nebula-graphd.conf.
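In nebula-graphd.conf the parameter is a gflags-style entry. The following fragment is a sketch assuming the graphd host has 4 CPU cores:
# Thread number for FIND PATH; keep it no greater than the CPU core count.\n--num_operator_threads=4\n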
FIND { SHORTEST | SINGLE SHORTEST | ALL | NOLOOP } PATH [WITH PROP] FROM <vertex_id_list> TO <vertex_id_list>\nOVER <edge_type_list> [REVERSELY | BIDIRECT] \n[<WHERE clause>] [UPTO <N> {STEP|STEPS}] \nYIELD path as <alias>\n[| ORDER BY $-.path] [| LIMIT <M>];\n\n<vertex_id_list> ::=\n [vertex_id [, vertex_id] ...]\n
SHORTEST
finds all the shortest paths. SINGLE SHORTEST finds one of the shortest paths. ALL
finds all the paths.NOLOOP
finds the paths without loops. WITH PROP
shows properties of vertices and edges. If not specified, properties will be hidden.<vertex_id_list>
is a list of vertex IDs separated with commas (,). It supports $-
and $var
.<edge_type_list>
is a list of edge types separated with commas (,). *
is all edge types.REVERSELY | BIDIRECT
specifies the direction. REVERSELY
is reverse graph traversal while BIDIRECT
is bidirectional graph traversal.<WHERE clause>
filters properties of edges.UPTO <N> {STEP|STEPS}
is the maximum hop number of the path. The default value is 5
.ORDER BY $-.path
specifies the order of the returned paths. For information about the order rules, see Path.LIMIT <M>
specifies the maximum number of rows to return.Note
The path type of FIND PATH
is trail
. Only vertices can be repeatedly visited in graph traversal. For more information, see Path.
FIND PATH
only supports filtering properties of edges with WHERE
clauses. Filtering properties of vertices and functions are not supported for now.FIND PATH
is a single-thread procedure, so it can consume a large amount of memory. A returned path is like (<vertex_id>)-[:<edge_type_name>@<rank>]->(<vertex_id>)
.
nebula> FIND SHORTEST PATH FROM \"player102\" TO \"team204\" OVER * YIELD path AS p;\n+--------------------------------------------+\n| p |\n+--------------------------------------------+\n| <(\"player102\")-[:serve@0 {}]->(\"team204\")> |\n+--------------------------------------------+\n
nebula> FIND SHORTEST PATH WITH PROP FROM \"team204\" TO \"player100\" OVER * REVERSELY YIELD path AS p;\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"team204\" :team{name: \"Spurs\"})<-[:serve@0 {end_year: 2016, start_year: 1997}]-(\"player100\" :player{age: 42, name: \"Tim Duncan\"})> |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n
nebula> FIND SHORTEST PATH FROM \"player100\", \"player130\" TO \"player132\", \"player133\" OVER * BIDIRECT UPTO 18 STEPS YIELD path as p;\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\")<-[:follow@0 {}]-(\"player144\")<-[:follow@0 {}]-(\"player133\")> |\n| <(\"player100\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player138\")-[:serve@0 {}]->(\"team225\")<-[:serve@0 {}]-(\"player132\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player112\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player114\")<-[:follow@0 {}]-(\"player133\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player109\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player114\")<-[:follow@0 {}]-(\"player133\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player104\")-[:serve@20182019 {}]->(\"team204\")<-[:serve@0 {}]-(\"player114\")<-[:follow@0 {}]-(\"player133\")> |\n| ... |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player112\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player138\")-[:serve@0 {}]->(\"team225\")<-[:serve@0 {}]-(\"player132\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player109\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player138\")-[:serve@0 {}]->(\"team225\")<-[:serve@0 {}]-(\"player132\")> |\n| ... |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
nebula> FIND ALL PATH FROM \"player100\" TO \"team204\" OVER * WHERE follow.degree is EMPTY or follow.degree >=0 YIELD path AS p;\n+------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------+\n| <(\"player100\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player125\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:serve@0 {}]->(\"team204\")> |\n|... |\n+------------------------------------------------------------------------------+\n
nebula> FIND NOLOOP PATH FROM \"player100\" TO \"team204\" OVER * YIELD path AS p;\n+--------------------------------------------------------------------------------------------------------+\n| p |\n+--------------------------------------------------------------------------------------------------------+\n| <(\"player100\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player125\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player125\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player102\")-[:serve@0 {}]->(\"team204\")> |\n+--------------------------------------------------------------------------------------------------------+\n
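The optional ORDER BY and LIMIT pipes from the syntax above can be combined with any of these examples. The following is a sketch (output omitted) that caps the traversal at 3 hops and keeps only the first 5 ordered paths:
nebula> FIND ALL PATH FROM "player100" TO "team204" OVER * UPTO 3 STEPS \\\n        YIELD path AS p | ORDER BY $-.p | LIMIT 5;\n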
"},{"location":"3.ngql-guide/7.general-query-statements/6.find-path/#faq","title":"FAQ","text":""},{"location":"3.ngql-guide/7.general-query-statements/6.find-path/#does_it_support_the_where_clause_to_achieve_conditional_filtering_during_graph_traversal","title":"Does it support the WHERE clause to achieve conditional filtering during graph traversal?","text":"FIND PATH
only supports filtering properties of edges with WHERE
clauses, such as WHERE follow.degree is EMPTY or follow.degree >=0
.
Filtering properties of vertices is not supported for now.
"},{"location":"3.ngql-guide/7.general-query-statements/7.get-subgraph/","title":"GET SUBGRAPH","text":"The GET SUBGRAPH
statement returns a subgraph that is generated by traversing a graph starting from a specified vertex. GET SUBGRAPH
statements allow you to specify the number of steps and the type or direction of edges during the traversal.
GET SUBGRAPH [WITH PROP] [<step_count> {STEP|STEPS}] FROM {<vid>, <vid>...}\n[{IN | OUT | BOTH} <edge_type>, <edge_type>...]\n[WHERE <expression> [AND <expression> ...]]\nYIELD {[VERTICES AS <vertex_alias>] [,EDGES AS <edge_alias>]};\n
WITH PROP
shows the properties. If not specified, the properties will be hidden.step_count
specifies the number of hops from the source vertices and returns the subgraph from 0 to step_count
hops. It must be a non-negative integer. Its default value is 1.vid
specifies the vertex IDs. edge_type
specifies the edge type. You can use IN
, OUT
, and BOTH
to specify the traversal direction of the edge type. The default is BOTH
.<WHERE clause>
specifies the filter conditions for the traversal, which can be used with the boolean operator AND
.YIELD
defines the output that needs to be returned. You can return only vertices or edges. A column alias must be set.Note
The path type of GET SUBGRAPH
is trail
. Only vertices can be repeatedly visited in graph traversal. For more information, see Path.
While using the WHERE
clause in a GET SUBGRAPH
statement, note the following restrictions:
AND
operator.$$.tagName.propName
.edge_type.propName
.The following graph is used as the sample.
Insert the test data:
nebula> CREATE SPACE IF NOT EXISTS subgraph(partition_num=15, replica_factor=1, vid_type=fixed_string(30));\nnebula> USE subgraph;\nnebula> CREATE TAG IF NOT EXISTS player(name string, age int);\nnebula> CREATE TAG IF NOT EXISTS team(name string);\nnebula> CREATE EDGE IF NOT EXISTS follow(degree int);\nnebula> CREATE EDGE IF NOT EXISTS serve(start_year int, end_year int);\nnebula> INSERT VERTEX player(name, age) VALUES \"player100\":(\"Tim Duncan\", 42);\nnebula> INSERT VERTEX player(name, age) VALUES \"player101\":(\"Tony Parker\", 36);\nnebula> INSERT VERTEX player(name, age) VALUES \"player102\":(\"LaMarcus Aldridge\", 33);\nnebula> INSERT VERTEX team(name) VALUES \"team203\":(\"Trail Blazers\"), \"team204\":(\"Spurs\");\nnebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player100\":(95);\nnebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player102\":(90);\nnebula> INSERT EDGE follow(degree) VALUES \"player102\" -> \"player100\":(75);\nnebula> INSERT EDGE serve(start_year, end_year) VALUES \"player101\" -> \"team204\":(1999, 2018),\"player102\" -> \"team203\":(2006, 2015);\n
player101
over all edge types and gets the subgraph.nebula> GET SUBGRAPH 1 STEPS FROM \"player101\" YIELD VERTICES AS nodes, EDGES AS relationships;\n+-------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n| nodes | relationships |\n+-------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n| [(\"player101\" :player{})] | [[:serve \"player101\"->\"team204\" @0 {}], [:follow \"player101\"->\"player100\" @0 {}], [:follow \"player101\"->\"player102\" @0 {}]] |\n| [(\"team204\" :team{}), (\"player100\" :player{}), (\"player102\" :player{})] | [[:follow \"player102\"->\"player100\" @0 {}]] |\n+-------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n
The returned subgraph is as follows.
player101
over incoming follow
edges and gets the subgraph.nebula> GET SUBGRAPH 1 STEPS FROM \"player101\" IN follow YIELD VERTICES AS nodes, EDGES AS relationships;\n+---------------------------+---------------+\n| nodes | relationships |\n+---------------------------+---------------+\n| [(\"player101\" :player{})] | [] |\n+---------------------------+---------------+\n
There is no incoming follow
edge to player101
, so only the vertex player101
is returned.
player101
over outgoing serve
edges, gets the subgraph, and shows the property of the edge.nebula> GET SUBGRAPH WITH PROP 1 STEPS FROM \"player101\" OUT serve YIELD VERTICES AS nodes, EDGES AS relationships;\n+-------------------------------------------------------+-------------------------------------------------------------------------+\n| nodes | relationships |\n+-------------------------------------------------------+-------------------------------------------------------------------------+\n| [(\"player101\" :player{age: 36, name: \"Tony Parker\"})] | [[:serve \"player101\"->\"team204\" @0 {end_year: 2018, start_year: 1999}]] |\n| [(\"team204\" :team{name: \"Spurs\"})] | [] |\n+-------------------------------------------------------+-------------------------------------------------------------------------+\n
The returned subgraph is as follows.
player101
over follow
edges, filters by degree > 90 and age > 30, and shows the properties of edges.nebula> GET SUBGRAPH WITH PROP 2 STEPS FROM \"player101\" \\\n WHERE follow.degree > 90 AND $$.player.age > 30 \\\n YIELD VERTICES AS nodes, EDGES AS relationships;\n+-------------------------------------------------------+------------------------------------------------------+\n| nodes | relationships |\n+-------------------------------------------------------+------------------------------------------------------+\n| [(\"player101\" :player{age: 36, name: \"Tony Parker\"})] | [[:follow \"player101\"->\"player100\" @0 {degree: 95}]] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"})] | [] |\n+-------------------------------------------------------+------------------------------------------------------+\n
step_count
?","text":"To show the completeness of the subgraph, an additional hop is made on all vertices that meet the conditions. The following graph is used as the sample.
GET SUBGRAPH 1 STEPS FROM \"A\";
are A->B
, B->A
, and A->C
. To show the completeness of the subgraph, an additional hop is made on all vertices that meet the conditions, namely B->C
.GET SUBGRAPH 1 STEPS FROM \"A\" IN follow;
is B->A
. To show the completeness of the subgraph, an additional hop is made on all vertices that meet the conditions, namely A->B
.If you only query paths or vertices that meet the conditions, we suggest you use MATCH or GO. The example is as follows.
nebula> MATCH p= (v:player) -- (v2) WHERE id(v)==\"A\" RETURN p;\nnebula> GO 1 STEPS FROM \"A\" OVER follow YIELD src(edge),dst(edge);\n
"},{"location":"3.ngql-guide/7.general-query-statements/7.get-subgraph/#why_is_the_number_of_hops_in_the_returned_result_lower_than_step_count","title":"Why is the number of hops in the returned result lower than step_count
?","text":"The query stops when there is not enough subgraph data and will not return the null value.
nebula> GET SUBGRAPH 100 STEPS FROM \"player101\" OUT follow YIELD VERTICES AS nodes, EDGES AS relationships;\n+----------------------------------------------------+--------------------------------------------------------------------------------------+\n| nodes | relationships |\n+----------------------------------------------------+--------------------------------------------------------------------------------------+\n| [(\"player101\" :player{})] | [[:follow \"player101\"->\"player100\" @0 {}], [:follow \"player101\"->\"player102\" @0 {}]] |\n| [(\"player100\" :player{}), (\"player102\" :player{})] | [[:follow \"player102\"->\"player100\" @0 {}]] |\n+----------------------------------------------------+--------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/optional-match/","title":"OPTIONAL MATCH","text":"Caution
The feature is still in beta. It will continue to be optimized.
The OPTIONAL MATCH
clause is used to search for the pattern described in it. OPTIONAL MATCH
matches patterns against your graph database, just like MATCH
does. The difference is that if no matches are found, OPTIONAL MATCH
will use a null for missing parts of the pattern.
This topic applies to the openCypher syntax in nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/optional-match/#limitations","title":"Limitations","text":"The WHERE
clause cannot be used in an OPTIONAL MATCH
clause.
The example of the use of OPTIONAL MATCH
in the MATCH
statement is as follows:
nebula> MATCH (m)-[]->(n) WHERE id(m)==\"player100\" \\\n OPTIONAL MATCH (n)-[]->(l) \\\n RETURN id(m),id(n),id(l);\n+-------------+-------------+-------------+\n| id(m) | id(n) | id(l) |\n+-------------+-------------+-------------+\n| \"player100\" | \"team204\" | __NULL__ |\n| \"player100\" | \"player101\" | \"team204\" |\n| \"player100\" | \"player101\" | \"team215\" |\n| \"player100\" | \"player101\" | \"player100\" |\n| \"player100\" | \"player101\" | \"player102\" |\n| \"player100\" | \"player101\" | \"player125\" |\n| \"player100\" | \"player125\" | \"team204\" |\n| \"player100\" | \"player125\" | \"player100\" |\n+-------------+-------------+-------------+\n
Using multiple MATCH
instead of OPTIONAL MATCH
returns rows that match the pattern exactly. The example is as follows:
nebula> MATCH (m)-[]->(n) WHERE id(m)==\"player100\" \\\n MATCH (n)-[]->(l) \\\n RETURN id(m),id(n),id(l);\n+-------------+-------------+-------------+\n| id(m) | id(n) | id(l) |\n+-------------+-------------+-------------+\n| \"player100\" | \"player101\" | \"team204\" |\n| \"player100\" | \"player101\" | \"team215\" |\n| \"player100\" | \"player101\" | \"player100\" |\n| \"player100\" | \"player101\" | \"player102\" |\n| \"player100\" | \"player101\" | \"player125\" |\n| \"player100\" | \"player125\" | \"team204\" |\n| \"player100\" | \"player125\" | \"player100\" |\n+-------------+-------------+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/1.show-charset/","title":"SHOW CHARSET","text":"The SHOW CHARSET
statement shows the available character sets.
Currently available types are utf8
and utf8mb4
. The default charset type is utf8
. NebulaGraph extends the uft8
to support four-byte characters. Therefore utf8
and utf8mb4
are equivalent.
SHOW CHARSET;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/1.show-charset/#example","title":"Example","text":"nebula> SHOW CHARSET;\n+---------+-----------------+-------------------+--------+\n| Charset | Description | Default collation | Maxlen |\n+---------+-----------------+-------------------+--------+\n| \"utf8\" | \"UTF-8 Unicode\" | \"utf8_bin\" | 4 |\n+---------+-----------------+-------------------+--------+\n
Parameter Description Charset
The name of the character set. Description
The description of the character set. Default collation
The default collation of the character set. Maxlen
The maximum number of bytes required to store one character."},{"location":"3.ngql-guide/7.general-query-statements/6.show/10.show-roles/","title":"SHOW ROLES","text":"The SHOW ROLES
statement shows the roles that are assigned to a user account.
The return message differs according to the role of the user who is running this statement:
GOD
or ADMIN
and is granted access to the specified graph space, NebulaGraph shows all roles in this graph space except for GOD
.DBA
, USER
, or GUEST
and is granted access to the specified graph space, NebulaGraph shows the user's own role in this graph space.PermissionError
.For more information about roles, see Roles and privileges.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/10.show-roles/#syntax","title":"Syntax","text":"SHOW ROLES IN <space_name>;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/10.show-roles/#example","title":"Example","text":"nebula> SHOW ROLES in basketballplayer;\n+---------+-----------+\n| Account | Role Type |\n+---------+-----------+\n| \"user1\" | \"ADMIN\" |\n+---------+-----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/11.show-snapshots/","title":"SHOW SNAPSHOTS","text":"The SHOW SNAPSHOTS
statement shows the information of all the snapshots.
For how to create a snapshot and backup data, see Snapshot.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/11.show-snapshots/#role_requirement","title":"Role requirement","text":"Only the root
user who has the GOD
role can use the SHOW SNAPSHOTS
statement.
SHOW SNAPSHOTS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/11.show-snapshots/#example","title":"Example","text":"nebula> SHOW SNAPSHOTS;\n+--------------------------------+---------+-----------------------------------------------------+\n| Name | Status | Hosts |\n+--------------------------------+---------+-----------------------------------------------------+\n| \"SNAPSHOT_2020_12_16_11_13_55\" | \"VALID\" | \"storaged0:9779, storaged1:9779, storaged2:9779\" |\n| \"SNAPSHOT_2020_12_16_11_14_10\" | \"VALID\" | \"storaged0:9779, storaged1:9779, storaged2:9779\" |\n+--------------------------------+---------+-----------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/12.show-spaces/","title":"SHOW SPACES","text":"The SHOW SPACES
statement shows existing graph spaces in NebulaGraph.
For how to create a graph space, see CREATE SPACE.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/12.show-spaces/#syntax","title":"Syntax","text":"SHOW SPACES;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/12.show-spaces/#example","title":"Example","text":"nebula> SHOW SPACES;\n+---------------------+\n| Name |\n+---------------------+\n| \"docs\" |\n| \"basketballplayer\" |\n+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/14.show-stats/","title":"SHOW STATS","text":"The SHOW STATS
statement shows the statistics of the graph space collected by the latest SUBMIT JOB STATS
job.
The statistics include the following information:
Warning
The data returned by SHOW STATS
is not real-time. The returned data is collected by the latest SUBMIT JOB STATS job and may include TTL-expired data. The expired data will be deleted and not included in the statistics the next time the Compaction operation is performed.
You have to run the SUBMIT JOB STATS
statement in the graph space where you want to collect statistics. For more information, see SUBMIT JOB STATS.
Caution
The result of the SHOW STATS
statement is based on the last executed SUBMIT JOB STATS
statement. If you want to update the result, run SUBMIT JOB STATS
again. Otherwise the statistics will be wrong.
SHOW STATS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/14.show-stats/#examples","title":"Examples","text":"# Choose a graph space.\nnebula> USE basketballplayer;\n\n# Start SUBMIT JOB STATS.\nnebula> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 98 |\n+------------+\n\n# Make sure the job executes successfully.\nnebula> SHOW JOB 98;\n+----------------+---------------+------------+----------------------------+----------------------------+-------------+\n| Job Id(TaskId) | Command(Dest) | Status | Start Time | Stop Time | Error Code |\n+----------------+---------------+------------+----------------------------+----------------------------+-------------+\n| 98 | \"STATS\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| 0 | \"storaged2\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| 1 | \"storaged0\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| 2 | \"storaged1\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| \"Total:3\" | \"Succeeded:3\" | \"Failed:0\" | \"In Progress:0\" | \"\" | \"\" |\n+----------------+---------------+------------+----------------------------+----------------------------+-------------+\n\n# Show the statistics of the graph space.\nnebula> SHOW STATS;\n+---------+------------+-------+\n| Type | Name | Count |\n+---------+------------+-------+\n| \"Tag\" | \"player\" | 51 |\n| \"Tag\" | \"team\" | 30 |\n| \"Edge\" | \"follow\" | 81 |\n| \"Edge\" | \"serve\" | 152 |\n| \"Space\" | \"vertices\" | 81 |\n| \"Space\" | \"edges\" | 233 |\n+---------+------------+-------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/15.show-tags-edges/","title":"SHOW TAGS/EDGES","text":"The SHOW TAGS
statement shows all the tags in the current graph space.
The SHOW EDGES
statement shows all the edge types in the current graph space.
SHOW {TAGS | EDGES};\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/15.show-tags-edges/#examples","title":"Examples","text":"nebula> SHOW TAGS;\n+----------+\n| Name |\n+----------+\n| \"player\" |\n| \"star\" |\n| \"team\" |\n+----------+\n\nnebula> SHOW EDGES;\n+----------+\n| Name |\n+----------+\n| \"follow\" |\n| \"serve\" |\n+----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/16.show-users/","title":"SHOW USERS","text":"The SHOW USERS
statement shows the user information.
Only the root
user who has the GOD
role can use the SHOW USERS
statement.
SHOW USERS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/16.show-users/#example","title":"Example","text":"nebula> SHOW USERS;\n+---------+-----------------+\n| Account | IP Whitelist |\n+---------+-----------------+\n| \"root\" | \"\" |\n| \"user1\" | \"\" |\n| \"user2\" | \"192.168.10.10\" |\n+---------+-----------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/17.show-sessions/","title":"SHOW SESSIONS","text":"When a user logs in to the database, a corresponding session will be created and users can query for session information.
The SHOW SESSIONS
statement shows the information of all the sessions. It can also show a specified session with its ID.
release
to release the session and clear the session information when you run exit
after the operation ends. If you exit the database in an unexpected way and the session timeout duration is not set via session_idle_timeout_secs
in nebula-graphd.conf, the session will not be released automatically. For those sessions that are not automatically released, you need to delete them manually. For details, see KILL SESSIONS.SHOW SESSIONS
queries the session information of all the Graph services.SHOW LOCAL SESSIONS
queries the session information of the currently connected Graph service and does not query the session information of other Graph services.SHOW SESSION <Session_Id>
queries the session information with a specific session id.SHOW [LOCAL] SESSIONS;\nSHOW SESSION <Session_Id>;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/17.show-sessions/#examples","title":"Examples","text":"nebula> SHOW SESSIONS;\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| SessionId | UserName | SpaceName | CreateTime | UpdateTime | GraphAddr | Timezone | ClientIp |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| 1651220858102296 | \"root\" | \"basketballplayer\" | 2022-04-29T08:27:38.102296 | 2022-04-29T08:50:46.282921 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1651199330300991 | \"root\" | \"basketballplayer\" | 2022-04-29T02:28:50.300991 | 2022-04-29T08:16:28.339038 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1651112899847744 | \"root\" | \"basketballplayer\" | 2022-04-28T02:28:19.847744 | 2022-04-28T08:17:44.470210 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1651041092662100 | \"root\" | \"basketballplayer\" | 2022-04-27T06:31:32.662100 | 2022-04-27T07:01:25.200978 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1650959429593975 | \"root\" | \"basketballplayer\" | 2022-04-26T07:50:29.593975 | 2022-04-26T07:51:47.184810 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1650958897679595 | \"root\" | \"\" | 2022-04-26T07:41:37.679595 | 2022-04-26T07:41:37.683802 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n\nnebula> SHOW SESSION 1635254859271703;\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| SessionId | UserName | SpaceName | CreateTime | UpdateTime | GraphAddr | Timezone | ClientIp |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| 1651220858102296 | \"root\" | \"basketballplayer\" | 2022-04-29T08:27:38.102296 | 2022-04-29T08:50:54.254384 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n
Parameter Description SessionId
The session ID, namely the identifier of a session. UserName
The username in a session. SpaceName
The name of the graph space that the user uses currently. It is null (\"\"
) when you first log in because there is no specified graph space. CreateTime
The time when the session is created, namely the time when the user logs in. The time zone is specified by timezone_name
in the configuration file. UpdateTime
The system will update the time when there is an operation. The time zone is specified by timezone_name
in the configuration file. GraphAddr
The IP (or hostname) and port of the Graph server that hosts the session. Timezone
A reserved parameter that has no specified meaning for now. ClientIp
The IP or hostname of the client."},{"location":"3.ngql-guide/7.general-query-statements/6.show/18.show-queries/","title":"SHOW QUERIES","text":"The SHOW QUERIES
statement shows the information of working queries in the current session.
Note
To terminate queries, see Kill Query.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/18.show-queries/#precautions","title":"Precautions","text":"SHOW LOCAL QUERIES
statement gets the status of queries in the current session from the local cache with almost no latency.SHOW QUERIES
statement gets the information of queries in all the sessions from the Meta Service. The information will be synchronized to the Meta Service according to the interval defined by session_reclaim_interval_secs
. Therefore the information that you get from the client may belong to the last synchronization interval.SHOW [LOCAL] QUERIES;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/18.show-queries/#examples","title":"Examples","text":"nebula> SHOW LOCAL QUERIES;\n+------------------+-----------------+--------+----------------------+----------------------------+----------------+-----------+-----------------------+\n| SessionID | ExecutionPlanID | User | Host | StartTime | DurationInUSec | Status | Query |\n+------------------+-----------------+--------+----------------------+----------------------------+----------------+-----------+-----------------------+\n| 1625463842921750 | 46 | \"root\" | \"\"192.168.x.x\":9669\" | 2021-07-05T05:44:19.502903 | 0 | \"RUNNING\" | \"SHOW LOCAL QUERIES;\" |\n+------------------+-----------------+--------+----------------------+----------------------------+----------------+-----------+-----------------------+\n\nnebula> SHOW QUERIES;\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+---------------------------------------------------------+\n| SessionID | ExecutionPlanID | User | Host | StartTime | DurationInUSec | Status | Query |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+---------------------------------------------------------+\n| 1625456037718757 | 54 | \"user1\" | \"\"192.168.x.x\":9669\" | 2021-07-05T05:51:08.691318 | 1504502 | \"RUNNING\" | \"MATCH p=(v:player)-[*1..4]-(v2) RETURN v2 AS Friends;\" |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+---------------------------------------------------------+\n\n# The following statement returns the top 10 queries that have the longest duration.\nnebula> SHOW QUERIES | ORDER BY $-.DurationInUSec DESC | LIMIT 10;\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+-------------------------------------------------------+\n| SessionID | ExecutionPlanID | User | Host | StartTime | DurationInUSec | Status | Query |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+-------------------------------------------------------+\n| 1625471375320831 | 98 | \"user2\" | \"\"192.168.x.x\":9669\" | 2021-07-05T07:50:24.461779 | 2608176 | \"RUNNING\" | \"MATCH (v:player)-[*1..4]-(v2) RETURN v2 AS Friends;\" |\n| 1625456037718757 | 99 | \"user1\" | \"\"192.168.x.x\":9669\" | 2021-07-05T07:50:24.910616 | 2159333 | \"RUNNING\" | \"MATCH (v:player)-[*1..4]-(v2) RETURN v2 AS Friends;\" |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+-------------------------------------------------------+\n
The descriptions are as follows.
Parameter DescriptionSessionID
The session ID. ExecutionPlanID
The ID of the execution plan. User
The username that executes the query. Host
The IP address and port of the Graph server that hosts the session. StartTime
The time when the query starts. DurationInUSec
The duration of the query. The unit is microsecond. Status
The current status of the query. Query
The query statement."},{"location":"3.ngql-guide/7.general-query-statements/6.show/19.show-meta-leader/","title":"SHOW META LEADER","text":"The SHOW META LEADER
statement shows the information of the leader in the current Meta cluster.
For more information about the Meta service, see Meta service.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/19.show-meta-leader/#syntax","title":"Syntax","text":"SHOW META LEADER;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/19.show-meta-leader/#example","title":"Example","text":"nebula> SHOW META LEADER;\n+------------------+---------------------------+\n| Meta Leader | secs from last heart beat |\n+------------------+---------------------------+\n| \"127.0.0.1:9559\" | 3 |\n+------------------+---------------------------+\n
Parameter Description Meta Leader
Shows the information of the leader in the Meta cluster, including the IP (or hostname) and port of the server where the leader is located. secs from last heart beat
Indicates the time interval since the last heartbeat. This parameter is measured in seconds."},{"location":"3.ngql-guide/7.general-query-statements/6.show/2.show-collation/","title":"SHOW COLLATION","text":"The SHOW COLLATION
statement shows the collations supported by NebulaGraph.
Currently available types are: utf8_bin
and utf8mb4_bin
.
utf8
, the default collate is utf8_bin
.utf8mb4
, the default collate is utf8mb4_bin
.SHOW COLLATION;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/2.show-collation/#example","title":"Example","text":"nebula> SHOW COLLATION;\n+------------+---------+\n| Collation | Charset |\n+------------+---------+\n| \"utf8_bin\" | \"utf8\" |\n+------------+---------+\n
Parameter Description Collation
The name of the collation. Charset
The name of the character set with which the collation is associated."},{"location":"3.ngql-guide/7.general-query-statements/6.show/4.show-create-space/","title":"SHOW CREATE SPACE","text":"The SHOW CREATE SPACE
statement shows the creating statement of the specified graph space.
For details about the graph space information, see CREATE SPACE.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/4.show-create-space/#syntax","title":"Syntax","text":"SHOW CREATE SPACE <space_name>;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/4.show-create-space/#example","title":"Example","text":"nebula> SHOW CREATE SPACE basketballplayer;\n+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------+\n| Space | Create Space |\n+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------+\n| \"basketballplayer\" | \"CREATE SPACE `basketballplayer` (partition_num = 10, replica_factor = 1, charset = utf8, collate = utf8_bin, vid_type = FIXED_STRING(32))\" |\n+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/5.show-create-tag-edge/","title":"SHOW CREATE TAG/EDGE","text":"The SHOW CREATE TAG
statement shows the basic information of the specified tag. For details about the tag, see CREATE TAG.
The SHOW CREATE EDGE
statement shows the basic information of the specified edge type. For details about the edge type, see CREATE EDGE.
SHOW CREATE {TAG <tag_name> | EDGE <edge_name>};\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/5.show-create-tag-edge/#examples","title":"Examples","text":"nebula> SHOW CREATE TAG player;\n+----------+-----------------------------------+\n| Tag | Create Tag |\n+----------+-----------------------------------+\n| \"player\" | \"CREATE TAG `player` ( |\n| | `name` string NULL, |\n| | `age` int64 NULL |\n| | ) ttl_duration = 0, ttl_col = \"\"\" |\n+----------+-----------------------------------+\n\nnebula> SHOW CREATE EDGE follow;\n+----------+-----------------------------------+\n| Edge | Create Edge |\n+----------+-----------------------------------+\n| \"follow\" | \"CREATE EDGE `follow` ( |\n| | `degree` int64 NULL |\n| | ) ttl_duration = 0, ttl_col = \"\"\" |\n+----------+-----------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/6.show-hosts/","title":"SHOW HOSTS","text":"The SHOW HOSTS
statement shows the cluster information, including the port, status, leader, partition, and version information. You can also add the service type in the statement to view the information of the specific service.
SHOW HOSTS [GRAPH | STORAGE | META];\n
Note
For a NebulaGraph cluster installed with the source code, the version of the cluster will not be displayed in the output after executing the command SHOW HOSTS (GRAPH | STORAGE | META)
with the service name.
nebula> SHOW HOSTS;\n+-------------+-------+----------+--------------+----------------------------------+------------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+-------+----------+--------------+----------------------------------+------------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 8 | \"docs:5, basketballplayer:3\" | \"docs:5, basketballplayer:3\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 9 | \"basketballplayer:4, docs:5\" | \"docs:5, basketballplayer:4\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 8 | \"basketballplayer:3, docs:5\" | \"docs:5, basketballplayer:3\" | \"master\" |\n+-------------+-------+----------+--------------+----------------------------------+------------------------------+---------+\n\nnebula> SHOW HOSTS GRAPH;\n+-----------+------+----------+---------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-----------+------+----------+---------+--------------+---------+\n| \"graphd\" | 9669 | \"ONLINE\" | \"GRAPH\" | \"3ba41bd\" | \"master\" |\n| \"graphd1\" | 9669 | \"ONLINE\" | \"GRAPH\" | \"3ba41bd\" | \"master\" |\n| \"graphd2\" | 9669 | \"ONLINE\" | \"GRAPH\" | \"3ba41bd\" | \"master\" |\n+-----------+------+----------+---------+--------------+---------+\n\nnebula> SHOW HOSTS STORAGE;\n+-------------+------+----------+-----------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-------------+------+----------+-----------+--------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n+-------------+------+----------+-----------+--------------+---------+\n\nnebula> SHOW HOSTS META;\n+----------+------+----------+--------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+----------+------+----------+--------+--------------+---------+\n| \"metad2\" | 9559 | \"ONLINE\" | \"META\" | \"3ba41bd\" | \"master\" |\n| \"metad0\" | 9559 | \"ONLINE\" | \"META\" | \"3ba41bd\" | \"master\" |\n| \"metad1\" | 9559 | \"ONLINE\" | \"META\" | \"3ba41bd\" | \"master\" |\n+----------+------+----------+--------+--------------+---------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/7.show-index-status/","title":"SHOW INDEX STATUS","text":"The SHOW INDEX STATUS
statement shows the status of jobs that rebuild native indexes, which helps check whether a native index is successfully rebuilt or not.
SHOW {TAG | EDGE} INDEX STATUS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/7.show-index-status/#examples","title":"Examples","text":"nebula> SHOW TAG INDEX STATUS;\n+------------------------------------+--------------+\n| Name | Index Status |\n+------------------------------------+--------------+\n| \"date1_index\" | \"FINISHED\" |\n| \"basketballplayer_all_tag_indexes\" | \"FINISHED\" |\n| \"any_shape_geo_index\" | \"FINISHED\" |\n+------------------------------------+--------------+\n\nnebula> SHOW EDGE INDEX STATUS;\n+----------------+--------------+\n| Name | Index Status |\n+----------------+--------------+\n| \"follow_index\" | \"FINISHED\" |\n+----------------+--------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/7.show-index-status/#related_topics","title":"Related topics","text":"The SHOW INDEXES
statement shows the names of existing native indexes.
SHOW {TAG | EDGE} INDEXES;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/8.show-indexes/#examples","title":"Examples","text":"nebula> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\n\nnebula> SHOW EDGE INDEXES;\n+----------------+----------+---------+\n| Index Name | By Edge | Columns |\n+----------------+----------+---------+\n| \"follow_index\" | \"follow\" | [] |\n+----------------+----------+---------+\n
Legacy version compatibility
In NebulaGraph 2.x, SHOW TAG/EDGE INDEXES
only returns Names
.
The SHOW PARTS
statement shows the information of a specified partition or all partitions in a graph space.
SHOW PARTS [<part_id>];\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/9.show-parts/#examples","title":"Examples","text":"nebula> SHOW PARTS;\n+--------------+--------------------+--------------------+-------+\n| Partition ID | Leader | Peers | Losts |\n+--------------+--------------------+--------------------+-------+\n| 1 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n| 2 | \"192.168.2.2:9779\" | \"192.168.2.2:9779\" | \"\" |\n| 3 | \"192.168.2.3:9779\" | \"192.168.2.3:9779\" | \"\" |\n| 4 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n| 5 | \"192.168.2.2:9779\" | \"192.168.2.2:9779\" | \"\" |\n| 6 | \"192.168.2.3:9779\" | \"192.168.2.3:9779\" | \"\" |\n| 7 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n| 8 | \"192.168.2.2:9779\" | \"192.168.2.2:9779\" | \"\" |\n| 9 | \"192.168.2.3:9779\" | \"192.168.2.3:9779\" | \"\" |\n| 10 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n+--------------+--------------------+--------------------+-------+\n\nnebula> SHOW PARTS 1;\n+--------------+--------------------+--------------------+-------+\n| Partition ID | Leader | Peers | Losts |\n+--------------+--------------------+--------------------+-------+\n| 1 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n+--------------+--------------------+--------------------+-------+\n
The descriptions are as follows.
Parameter DescriptionPartition ID
The ID of the partition. Leader
The IP (or hostname) and the port of the leader. Peers
The IPs (or hostnames) and the ports of all the replicas. Losts
The IPs (or hostnames) and the ports of replicas at fault."},{"location":"3.ngql-guide/8.clauses-and-options/group-by/","title":"GROUP BY","text":"The GROUP BY
clause can be used to aggregate data.
This topic applies to native nGQL only.
You can also use the count() function to aggregate data.
nebula> MATCH (v:player)<-[:follow]-(:player) RETURN v.player.name AS Name, count(*) as cnt ORDER BY cnt DESC;\n+----------------------+-----+\n| Name | cnt |\n+----------------------+-----+\n| \"Tim Duncan\" | 10 |\n| \"LeBron James\" | 6 |\n| \"Tony Parker\" | 5 |\n| \"Chris Paul\" | 4 |\n| \"Manu Ginobili\" | 4 |\n+----------------------+-----+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/group-by/#syntax","title":"Syntax","text":"The GROUP BY
clause groups the rows with the same value. Then operations such as counting, sorting, and calculation can be applied.
The GROUP BY
clause works after the pipe symbol (|) and before a YIELD
clause.
| GROUP BY <var> YIELD <var>, <aggregation_function(var)>\n
The aggregation_function()
function supports avg()
, sum()
, max()
, min()
, count()
, collect()
, and std()
.
The following statement finds all the vertices connected directly to vertex \"player100\"
, groups the result set by player names, and counts how many times the name shows up in the result set.
nebula> GO FROM \"player100\" OVER follow BIDIRECT \\\n YIELD properties($$).name as Name \\\n | GROUP BY $-.Name \\\n YIELD $-.Name as Player, count(*) AS Name_Count;\n+---------------------+------------+\n| Player | Name_Count |\n+---------------------+------------+\n| \"Shaquille O'Neal\" | 1 |\n| \"Tiago Splitter\" | 1 |\n| \"Manu Ginobili\" | 2 |\n| \"Boris Diaw\" | 1 |\n| \"LaMarcus Aldridge\" | 1 |\n| \"Tony Parker\" | 2 |\n| \"Marco Belinelli\" | 1 |\n| \"Dejounte Murray\" | 1 |\n| \"Danny Green\" | 1 |\n| \"Aron Baynes\" | 1 |\n+---------------------+------------+\n
The following statement finds all the vertices connected directly to vertex \"player100\"
, groups the result set by source vertices, and returns the sum of degree values.
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge) AS player, properties(edge).degree AS degree \\\n | GROUP BY $-.player \\\n YIELD sum($-.degree);\n+----------------+\n| sum($-.degree) |\n+----------------+\n| 190 |\n+----------------+\n
For more information about the sum()
function, see Built-in math functions.
The usage of GROUP BY
in the above nGQL statements that explicitly write GROUP BY
and act as grouping fields is called explicit GROUP BY
, while in openCypher, the GROUP BY
is implicit, i.e., GROUP BY
groups fields without explicitly writing GROUP BY
. The explicit GROUP BY
in nGQL is the same as the implicit GROUP BY
in openCypher, and nGQL also supports the implicit GROUP BY
. For the implicit usage of GROUP BY
, see Stack Overflow.
For example, to look up the players over 34 years old with the same length of service, you can use the following statement:
nebula> LOOKUP ON player WHERE player.age > 34 YIELD id(vertex) AS v | \\\n GO FROM $-.v OVER serve YIELD serve.start_year AS start_year, serve.end_year AS end_year | \\\n YIELD $-.start_year, $-.end_year, count(*) AS count | \\\n ORDER BY $-.count DESC | LIMIT 5;\n+---------------+-------------+-------+\n| $-.start_year | $-.end_year | count |\n+---------------+-------------+-------+\n| 2018 | 2019 | 3 |\n| 2007 | 2012 | 2 |\n| 1998 | 2004 | 2 |\n| 2017 | 2018 | 2 |\n| 2010 | 2011 | 2 |\n+---------------+-------------+-------+ \n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/","title":"INNER JOIN","text":"INNER JOIN
is a type of join query that matches records based on common column values between two tables. It is commonly used to create a result set that includes two tables based on values in their associated columns. In NebulaGraph, the INNER JOIN
clause can be used explicitly to conduct join queries between two tables, enabling more complex queries.
Note
In nGQL statements, the multi-hop query of GO
implicitly utilizes the INNER JOIN
clause. For example, in the statement GO 1 TO 2 STEPS FROM \"player101\" OVER follow YIELD $$.player.name AS name, $$.player.age AS age
, the GO
clause implicitly utilizes the INNER JOIN
clause, matching the result columns of the first-hop query starting from player101
along the follow
edge with the starting columns of the second-hop query. Then, based on the matching results, it returns name
and age
.
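As a sketch, the statement from this note can be run as follows on the basketballplayer dataset (output omitted here, since it depends on the loaded data):
nebula> GO 1 TO 2 STEPS FROM \"player101\" OVER follow \\\n        YIELD $$.player.name AS name, $$.player.age AS age;\n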
The INNER JOIN
clause is only applicable to the native nGQL syntax.
YIELD <column_name_list>\nFROM <first_table> INNER JOIN <second_table> ON <join_condition>\n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/#notes","title":"Notes","text":"To conduct an INNER JOIN
query, you need to follow these rules:
Use a YIELD clause to specify the returned columns, and place it before the INNER JOIN clause. Use a FROM clause to specify the two tables to be joined. The INNER JOIN clause must contain the ON clause, which specifies the join condition. The join condition only supports equi-join (i.e., ==). <first_table> and <second_table> are the two tables to be joined, and the two table names cannot be the same. The following examples show how to use the INNER JOIN
clause to join the results of two queries in nGQL statements.
Firstly, the dst column obtained from the initial LOOKUP operation (whose value for Tony Parker is player101) is joined with the src column obtained from the second GO query (which contains the IDs player101 and player125). By matching the two columns where player101 appears on both sides, we obtain the resulting data set. The final step then uses the YIELD clause YIELD $b.vid AS vid, $a.v AS v, $b.e2 AS e2 to display the information.
nebula> $a = LOOKUP ON player WHERE player.name == 'Tony Parker' YIELD id(vertex) as dst, vertex AS v; \\\n $b = GO FROM 'player101', 'player125' OVER follow YIELD id($^) as src, id($$) as vid, edge AS e2; \\\n YIELD $b.vid AS vid, $a.v AS v, $b.e2 AS e2 FROM $a INNER JOIN $b ON $a.dst == $b.src;\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| vid | v | e2 |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| \"player100\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player100\" @0 {degree: 95}] |\n| \"player102\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| \"player125\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player125\" @0 {degree: 95}] |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/#example_2","title":"Example 2","text":"The following nGQL example utilizes the INNER JOIN
clause to combine the src column from the first LOOKUP query (where player101 is the ID of Tony Parker) with the src column from the second FETCH query (which fetches the edge from player101 to player100). By matching player101 in both src columns, we obtain the resulting data set. The final step then uses the YIELD clause YIELD $a.src AS src, $a.v AS v, $b.e AS e to display the information.
nebula> $a = LOOKUP ON player WHERE player.name == 'Tony Parker' YIELD id(vertex) as src, vertex AS v; \\\n $b = FETCH PROP ON follow 'player101'->'player100' YIELD src(edge) as src, edge as e; \\\n YIELD $a.src AS src, $a.v AS v, $b.e AS e FROM $a INNER JOIN $b ON $a.src == $b.src;\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| src | v | e |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| \"player101\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player100\" @0 {degree: 95}] |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/#example_3","title":"Example 3","text":"The following example shows the process of using the INNER JOIN
clause to join the results of the LOOKUP
, GO
, and FIND PATH
clauses.
Query the player table using the LOOKUP ON statement to find the vertex for player Tony Parker, storing the ID and properties in the $a.src and $a.v columns, respectively. Use the GO statement to find the player nodes that are reachable in 2-5 steps through the follow edges from the node $a.src and whose corresponding players are older than 30, storing the IDs of these nodes in the $b.dst column. Use the FIND ALL PATH statement to find all the paths that traverse the follow edges from $a.src to $b.dst, returning the paths themselves as $c.p and the destination of each path as $c.dst. Using the FIND SHORTEST PATH statement, find the shortest path from $c.dst back to $a.src, storing the path in $d.p and the starting point in $d.src. Use the INNER JOIN clause to join the results of steps 3 and 4 by matching the $c.dst column with the $d.src column. Then use the YIELD statement YIELD $c.forward AS forwardPath, $c.dst AS end, $d.p AS backwardPath
to return the matched records of the join.nebula> $a = LOOKUP ON player WHERE player.name == 'Tony Parker' YIELD id(vertex) as src, vertex AS v; \\\n $b = GO 2 TO 5 STEPS FROM $a.src OVER follow WHERE $$.player.age > 30 YIELD id($$) AS dst; \\\n $c = (FIND ALL PATH FROM $a.src TO $b.dst OVER follow YIELD path AS p | YIELD $-.p AS forward, id(endNode($-.p)) AS dst); \\\n $d = (FIND SHORTEST PATH FROM $c.dst TO $a.src OVER follow YIELD path AS p | YIELD $-.p AS p, id(startNode($-.p)) AS src); \\\n YIELD $c.forward AS forwardPath, $c.dst AS end, $d.p AS backwordPath FROM $c INNER JOIN $d ON $c.dst == $d.src;\n+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-----------------------------------------------------------------------------+\n| forwardPath | end | backwordPath |\n+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-----------------------------------------------------------------------------+\n| <(\"player101\")-[:follow@0 {}]->(\"player102\")> | \"player102\" | <(\"player102\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player102\")> | \"player102\" | <(\"player102\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player102\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player102\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n...\n+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-----------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/","title":"LIMIT AND SKIP","text":"The LIMIT
clause constrains the number of rows in the output. The usage of LIMIT
in native nGQL statements and openCypher compatible statements is different.
In native nGQL statements, a pipe (|) needs to be used before the LIMIT clause. The offset parameter can be set or omitted directly after the LIMIT statement. In openCypher compatible statements, no pipe is needed before the LIMIT clause, and you can use SKIP to indicate an offset.
When using LIMIT
in either syntax above, it is important to use an ORDER BY
clause that constrains the output into a unique order. Otherwise, you will get an unpredictable subset of the output.
In native nGQL, LIMIT
has general syntax and exclusive syntax in GO
statements.
In native nGQL, the general LIMIT
syntax works the same as in SQL
. The LIMIT
clause accepts one or two parameters. The values of both parameters must be non-negative integers and be used after a pipe. The syntax and description are as follows:
... | LIMIT [<offset>,] <number_rows>;\n
Parameter Description offset
The offset value. It defines the row from which to start returning. The offset starts from 0
. The default value is 0
, which returns from the first row. number_rows
It constrains the total number of returned rows. For example:
# The following example returns the top 3 rows of data from the result.\nnebula> LOOKUP ON player YIELD id(vertex)|\\\n LIMIT 3;\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player102\" |\n+-------------+\n\n# The following example returns the 3 rows of data starting from the second row of the sorted output.\nnebula> GO FROM \"player100\" OVER follow REVERSELY \\\n YIELD properties($$).name AS Friend, properties($$).age AS Age \\\n | ORDER BY $-.Age, $-.Friend \\\n | LIMIT 1, 3;\n+-------------------+-----+\n| Friend | Age |\n+-------------------+-----+\n| \"Danny Green\" | 31 |\n| \"Aron Baynes\" | 32 |\n| \"Marco Belinelli\" | 32 |\n+-------------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#limit_in_go_statements","title":"LIMIT in GO statements","text":"In addition to the general syntax in the native nGQL, the LIMIT
in the GO
statement also supports limiting the number of output results based on edges.
Syntax:
<go_statement> LIMIT <limit_list>;\n
limit_list
is a list. Elements in the list must be natural numbers, and the number of elements must be the same as the maximum number of STEPS
in the GO
statement. The following takes GO 1 TO 3 STEPS FROM \"A\" OVER * LIMIT <limit_list>
as an example to introduce this usage of LIMIT
in detail.
limit_list
must contain 3 natural numbers, such as GO 1 TO 3 STEPS FROM \"A\" OVER * LIMIT [1,2,4]
.1
in LIMIT [1,2,4]
means that the system automatically selects 1 edge to continue traversal in the first step. 2
means to select 2 edges to continue traversal in the second step. 4
indicates that 4 edges are selected to continue traversal in the third step.GO 1 TO 3 STEPS
means to return all the traversal results from the first to third steps, all the red edges and their source and destination vertices in the figure below will be matched by this GO
statement. And the yellow edges represent there is no path selected when the GO statement traverses. If it is not GO 1 TO 3 STEPS
but GO 3 STEPS
, it will only match the red edges of the third step and the vertices at both ends.In the basketballplayer dataset, the example is as follows:
nebula> GO 3 STEPS FROM \"player100\" \\\n OVER * \\\n YIELD properties($$).name AS NAME, properties($$).age AS Age \\\n LIMIT [3,3,3];\n+-----------------+----------+\n| NAME | Age |\n+-----------------+----------+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n+-----------------+----------+\n\nnebula> GO 3 STEPS FROM \"player102\" OVER * BIDIRECT\\\n YIELD dst(edge) \\\n LIMIT [rand32(5),rand32(5),rand32(5)];\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player100\" |\n+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#limit_in_opencypher_compatible_statements","title":"LIMIT in openCypher compatible statements","text":"In openCypher compatible statements such as MATCH
, there is no need to use a pipe when LIMIT
is used. The syntax and description are as follows:
... [SKIP <offset>] [LIMIT <number_rows>];\n
Parameter Description offset
The offset value. It defines the row from which to start returning. The offset starts from 0
. The default value is 0
, which returns from the first row. number_rows
It constrains the total number of returned rows. Both offset
and number_rows
accept expressions, but the result of the expression must be a non-negative integer.
Note
Fraction expressions composed of two integers are automatically floored to integers. For example, 8/6
is floored to 1.
LIMIT
can be used alone to return a specified number of results.
nebula> MATCH (v:player) RETURN v.player.name AS Name, v.player.age AS Age \\\n ORDER BY Age LIMIT 5;\n+-------------------------+-----+\n| Name | Age |\n+-------------------------+-----+\n| \"Luka Doncic\" | 20 |\n| \"Ben Simmons\" | 22 |\n| \"Kristaps Porzingis\" | 23 |\n| \"Giannis Antetokounmpo\" | 24 |\n| \"Kyle Anderson\" | 25 |\n+-------------------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#examples_of_skip","title":"Examples of SKIP","text":"SKIP
can be used alone to set the offset and return the data after the specified position.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC SKIP 1;\n+-----------------+-----+\n| Name | Age |\n+-----------------+-----+\n| \"Manu Ginobili\" | 41 |\n| \"Tony Parker\" | 36 |\n+-----------------+-----+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC SKIP 1+1;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 36 |\n+---------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#example_of_skip_and_limit","title":"Example of SKIP and LIMIT","text":"SKIP
and LIMIT
can be used together to return the specified amount of data starting from the specified position.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC SKIP 1 LIMIT 1;\n+-----------------+-----+\n| Name | Age |\n+-----------------+-----+\n| \"Manu Ginobili\" | 41 |\n+-----------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/","title":"ORDER BY","text":"The ORDER BY
clause specifies the order of the rows in the output.
|
) and an ORDER BY
clause after YIELD
clause.ORDER BY
clause follows a RETURN
clause.There are two order options:
ASC
: Ascending. ASC
is the default order.DESC
: Descending.<YIELD clause>\n| ORDER BY <expression> [ASC | DESC] [, <expression> [ASC | DESC] ...];\n
Compatibility
In the native nGQL syntax, $-.
must be used after ORDER BY
. But it is not required in releases prior to 2.5.0.
nebula> FETCH PROP ON player \"player100\", \"player101\", \"player102\", \"player103\" \\\n YIELD player.age AS age, player.name AS name \\\n | ORDER BY $-.age ASC, $-.name DESC;\n+-----+---------------------+\n| age | name |\n+-----+---------------------+\n| 32 | \"Rudy Gay\" |\n| 33 | \"LaMarcus Aldridge\" |\n| 36 | \"Tony Parker\" |\n| 42 | \"Tim Duncan\" |\n+-----+---------------------+\n\nnebula> $var = GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS dst; \\\n ORDER BY $var.dst DESC;\n+-------------+\n| dst |\n+-------------+\n| \"player125\" |\n| \"player101\" |\n+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/#opencypher_syntax","title":"OpenCypher Syntax","text":"<RETURN clause>\nORDER BY <expression> [ASC | DESC] [, <expression> [ASC | DESC] ...];\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/#examples_1","title":"Examples","text":"nebula> MATCH (v:player) RETURN v.player.name AS Name, v.player.age AS Age \\\n ORDER BY Name DESC;\n+-----------------+-----+\n| Name | Age |\n+-----------------+-----+\n| \"Yao Ming\" | 38 |\n| \"Vince Carter\" | 42 |\n| \"Tracy McGrady\" | 39 |\n| \"Tony Parker\" | 36 |\n| \"Tim Duncan\" | 42 |\n+-----------------+-----+\n...\n\n# In the following example, nGQL sorts the rows by age first. If multiple people are of the same age, nGQL will then sort them by name.\nnebula> MATCH (v:player) RETURN v.player.age AS Age, v.player.name AS Name \\\n ORDER BY Age DESC, Name ASC;\n+-----+-------------------+\n| Age | Name |\n+-----+-------------------+\n| 47 | \"Shaquille O'Neal\" |\n| 46 | \"Grant Hill\" |\n| 45 | \"Jason Kidd\" |\n| 45 | \"Steve Nash\" |\n+-----+-------------------+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/#order_of_null_values","title":"Order of NULL values","text":"nGQL lists NULL values at the end of the output for ascending sorting, and at the start for descending sorting.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age;\n+-----------------+----------+\n| Name | Age |\n+-----------------+----------+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n| __NULL__ | __NULL__ |\n+-----------------+----------+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC;\n+-----------------+----------+\n| Name | Age |\n+-----------------+----------+\n| __NULL__ | __NULL__ |\n| \"Manu Ginobili\" | 41 |\n| \"Tony Parker\" | 36 |\n+-----------------+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/","title":"RETURN","text":"The RETURN
clause defines the output of an nGQL query. To return multiple fields, separate them with commas.
RETURN
can lead a clause or a statement:
RETURN
clause can work in openCypher statements in nGQL, such as MATCH
or UNWIND
.RETURN
statement can work independently to output the result of an expression.This topic applies to the openCypher syntax in nGQL only. For native nGQL, use YIELD
.
RETURN
does not support the following openCypher features yet.
Return variables with uncommon characters, for example:
MATCH (`non-english_characters`:player) \\\nRETURN `non-english_characters`;\n
Set a pattern in the RETURN
clause and return all elements that this pattern matches, for example:
MATCH (v:player) \\\nRETURN (v)-[e]->(v2);\n
When RETURN
returns the map data structure, the order of key-value pairs is undefined.
nebula> RETURN {age: 32, name: \"Marco Belinelli\"};\n+------------------------------------+\n| {age:32,name:\"Marco Belinelli\"} |\n+------------------------------------+\n| {age: 32, name: \"Marco Belinelli\"} |\n+------------------------------------+\n\nnebula> RETURN {zage: 32, name: \"Marco Belinelli\"};\n+-------------------------------------+\n| {zage:32,name:\"Marco Belinelli\"} |\n+-------------------------------------+\n| {name: \"Marco Belinelli\", zage: 32} |\n+-------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_vertices_or_edges","title":"Return vertices or edges","text":"Use the RETURN {<vertex_name> | <edge_name>}
to return vertices and edges all information.
// Return vertices\nnebula> MATCH (v:player) \\\n RETURN v;\n+---------------------------------------------------------------+\n| v |\n+---------------------------------------------------------------+\n| (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}) |\n| (\"player107\" :player{age: 32, name: \"Aron Baynes\"}) |\n| (\"player116\" :player{age: 34, name: \"LeBron James\"}) |\n| (\"player120\" :player{age: 29, name: \"James Harden\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n+---------------------------------------------------------------+\n...\n\n// Return edges\nnebula> MATCH (v:player)-[e]->() \\\n RETURN e;\n+------------------------------------------------------------------------------+\n| e |\n+------------------------------------------------------------------------------+\n| [:follow \"player104\"->\"player100\" @0 {degree: 55}] |\n| [:follow \"player104\"->\"player101\" @0 {degree: 50}] |\n| [:follow \"player104\"->\"player105\" @0 {degree: 60}] |\n| [:serve \"player104\"->\"team200\" @0 {end_year: 2009, start_year: 2007}] |\n| [:serve \"player104\"->\"team208\" @0 {end_year: 2016, start_year: 2015}] |\n+------------------------------------------------------------------------------+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_vids","title":"Return VIDs","text":"Use the id()
function to retrieve VIDs.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN id(v);\n+-------------+\n| id(v) |\n+-------------+\n| \"player100\" |\n+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_tag","title":"Return Tag","text":"Use the labels()
function to return the list of tags on a vertex.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN labels(v);\n+------------+\n| labels(v) |\n+------------+\n| [\"player\"] |\n+------------+\n
To retrieve the nth element in the labels(v)
list, use labels(v)[n-1]
. The following example shows how to use labels(v)[0]
to return the first tag in the list.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN labels(v)[0];\n+--------------+\n| labels(v)[0] |\n+--------------+\n| \"player\" |\n+--------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_properties","title":"Return properties","text":"When returning properties of a vertex, it is necessary to specify the tag to which the properties belong because a vertex can have multiple tags and the same property name can appear on different tags.
It is possible to specify the tag of a vertex to return all properties of that tag, or to specify both the tag and a property name to return only that property of the tag.
nebula> MATCH (v:player) \\\n RETURN v.player, v.player.name, v.player.age \\\n LIMIT 3;\n+--------------------------------------+---------------------+--------------+\n| v.player | v.player.name | v.player.age |\n+--------------------------------------+---------------------+--------------+\n| {age: 33, name: \"LaMarcus Aldridge\"} | \"LaMarcus Aldridge\" | 33 |\n| {age: 25, name: \"Kyle Anderson\"} | \"Kyle Anderson\" | 25 |\n| {age: 40, name: \"Kobe Bryant\"} | \"Kobe Bryant\" | 40 |\n+--------------------------------------+---------------------+--------------+\n
When returning edge properties, it is not necessary to specify the edge type to which the properties belong, because an edge can only have one edge type.
// Return the property of a vertex\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) \\\n RETURN properties(v2);\n+----------------------------------+\n| properties(v2) |\n+----------------------------------+\n| {name: \"Spurs\"} |\n| {age: 36, name: \"Tony Parker\"} |\n| {age: 41, name: \"Manu Ginobili\"} |\n+----------------------------------+\n
// Return the property of an edge\nnebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN e.start_year, e.degree \\\n+--------------+----------+\n| e.start_year | e.degree |\n+--------------+----------+\n| __NULL__ | 95 |\n| __NULL__ | 95 |\n| 1997 | __NULL__ |\n+--------------+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_edge_type","title":"Return edge type","text":"Use the type()
function to return the matched edge types.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN DISTINCT type(e);\n+----------+\n| type(e) |\n+----------+\n| \"serve\" |\n| \"follow\" |\n+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_paths","title":"Return paths","text":"Use RETURN <path_name>
to return all the information of the matched paths.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[*3]->() \\\n RETURN p;\n+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})-[:serve@0 {end_year: 2019, start_year: 2015}]->(\"team204\" :team{name: \"Spurs\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})-[:serve@0 {end_year: 2015, start_year: 2006}]->(\"team203\" :team{name: \"Trail Blazers\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})-[:follow@0 {degree: 75}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_vertices_in_a_path","title":"Return vertices in a path","text":"Use the nodes()
function to return all vertices in a path.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) \\\n RETURN nodes(p);\n+-------------------------------------------------------------------------------------------------------------+\n| nodes(p) |\n+-------------------------------------------------------------------------------------------------------------+\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"team204\" :team{name: \"Spurs\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player101\" :player{age: 36, name: \"Tony Parker\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player125\" :player{age: 41, name: \"Manu Ginobili\"})] |\n+-------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_edges_in_a_path","title":"Return edges in a path","text":"Use the relationships()
function to return all edges in a path.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) \\\n RETURN relationships(p);\n+-------------------------------------------------------------------------+\n| relationships(p) |\n+-------------------------------------------------------------------------+\n| [[:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}]] |\n| [[:follow \"player100\"->\"player101\" @0 {degree: 95}]] |\n| [[:follow \"player100\"->\"player125\" @0 {degree: 95}]] |\n+-------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_path_length","title":"Return path length","text":"Use the length()
function to return the length of a path.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[*..2]->(v2) \\\n RETURN p AS Paths, length(p) AS Length;\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+\n| Paths | Length |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})> | 1 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> | 1 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})> | 1 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:serve@0 {end_year: 2018, start_year: 1999}]->(\"team204\" :team{name: \"Spurs\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:serve@0 {end_year: 2019, start_year: 2018}]->(\"team215\" :team{name: \"Hornets\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 95}]->(\"player100\" :player{age: 42, name: \"Tim Duncan\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})-[:serve@0 {end_year: 2018, start_year: 2002}]->(\"team204\" :team{name: \"Spurs\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})-[:follow@0 {degree: 90}]->(\"player100\" :player{age: 42, name: \"Tim Duncan\"})> | 2 |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_all_elements","title":"Return all elements","text":"To return all the elements that this pattern matches, use an asterisk (*).
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN *;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\n RETURN *;\n+----------------------------------------------------+-----------------------------------------------------------------------+-------------------------------------------------------+\n| v | e | v2 |\n+----------------------------------------------------+-----------------------------------------------------------------------+-------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | [:follow \"player100\"->\"player101\" @0 {degree: 95}] | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | [:follow \"player100\"->\"player125\" @0 {degree: 95}] | (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] | (\"team204\" :team{name: \"Spurs\"}) |\n+----------------------------------------------------+-----------------------------------------------------------------------+-------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#rename_a_field","title":"Rename a field","text":"Use the AS <alias>
syntax to rename a field in the output.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[:serve]->(v2) \\\n RETURN v2.team.name AS Team;\n+---------+\n| Team |\n+---------+\n| \"Spurs\" |\n+---------+\n\nnebula> RETURN \"Amber\" AS Name;\n+---------+\n| Name |\n+---------+\n| \"Amber\" |\n+---------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_a_non-existing_property","title":"Return a non-existing property","text":"If a property matched does not exist, NULL
is returned.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\n RETURN v2.player.name, type(e), v2.player.age;\n+-----------------+----------+---------------+\n| v2.player.name | type(e) | v2.player.age |\n+-----------------+----------+---------------+\n| \"Manu Ginobili\" | \"follow\" | 41 |\n| __NULL__ | \"serve\" | __NULL__ |\n| \"Tony Parker\" | \"follow\" | 36 |\n+-----------------+----------+---------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_expression_results","title":"Return expression results","text":"To return the results of expressions such as literals, functions, or predicates, set them in a RETURN
clause.
nebula> MATCH (v:player{name:\"Tony Parker\"})-->(v2:player) \\\n RETURN DISTINCT v2.player.name, \"Hello\"+\" graphs!\", v2.player.age > 35;\n+---------------------+----------------------+--------------------+\n| v2.player.name | (\"Hello\"+\" graphs!\") | (v2.player.age>35) |\n+---------------------+----------------------+--------------------+\n| \"LaMarcus Aldridge\" | \"Hello graphs!\" | false |\n| \"Tim Duncan\" | \"Hello graphs!\" | true |\n| \"Manu Ginobili\" | \"Hello graphs!\" | true |\n+---------------------+----------------------+--------------------+\n\nnebula> RETURN 1+1;\n+-------+\n| (1+1) |\n+-------+\n| 2 |\n+-------+\n\nnebula> RETURN 1- -1;\n+----------+\n| (1--(1)) |\n+----------+\n| 2 |\n+----------+\n\nnebula> RETURN 3 > 1;\n+-------+\n| (3>1) |\n+-------+\n| true |\n+-------+\n\nnebula> RETURN 1+1, rand32(1, 5);\n+-------+-------------+\n| (1+1) | rand32(1,5) |\n+-------+-------------+\n| 2 | 1 |\n+-------+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_unique_fields","title":"Return unique fields","text":"Use DISTINCT
to remove duplicate fields in the result set.
# Before using DISTINCT.\nnebula> MATCH (v:player{name:\"Tony Parker\"})--(v2:player) \\\n RETURN v2.player.name, v2.player.age;\n+---------------------+---------------+\n| v2.player.name | v2.player.age |\n+---------------------+---------------+\n| \"Manu Ginobili\" | 41 |\n| \"Boris Diaw\" | 36 |\n| \"Marco Belinelli\" | 32 |\n| \"Dejounte Murray\" | 29 |\n| \"Tim Duncan\" | 42 |\n| \"Tim Duncan\" | 42 |\n| \"LaMarcus Aldridge\" | 33 |\n| \"LaMarcus Aldridge\" | 33 |\n+---------------------+---------------+\n\n# After using DISTINCT.\nnebula> MATCH (v:player{name:\"Tony Parker\"})--(v2:player) \\\n RETURN DISTINCT v2.player.name, v2.player.age;\n+---------------------+---------------+\n| v2.player.name | v2.player.age |\n+---------------------+---------------+\n| \"Manu Ginobili\" | 41 |\n| \"Boris Diaw\" | 36 |\n| \"Marco Belinelli\" | 32 |\n| \"Dejounte Murray\" | 29 |\n| \"Tim Duncan\" | 42 |\n| \"LaMarcus Aldridge\" | 33 |\n+---------------------+---------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/sample/","title":"SAMPLE","text":"The SAMPLE
clause takes samples evenly in the result set and returns the specified amount of data.
SAMPLE
can be used in GO
statements only. The syntax is as follows:
<go_statement> SAMPLE <sample_list>;\n
sample_list
is a list. Elements in the list must be natural numbers, and the number of elements must be the same as the maximum number of STEPS
in the GO
statement. The following takes GO 1 TO 3 STEPS FROM \"A\" OVER * SAMPLE <sample_list>
as an example to introduce this usage of SAMPLE
in detail.
sample_list
must contain 3 natural numbers, such as GO 1 TO 3 STEPS FROM \"A\" OVER * SAMPLE [1,2,4]
.1
in SAMPLE [1,2,4]
means that the system automatically selects 1 edge to continue traversal in the first step. 2
means to select 2 edges to continue traversal in the second step. 4
indicates that 4 edges are selected to continue traversal in the third step. If there is no matched edge in a certain step or the number of matched edges is less than the specified number, the actual number will be returned.GO 1 TO 3 STEPS
means to return all the traversal results from the first to third steps, all the red edges and their source and destination vertices in the figure below will be matched by this GO
statement. And the yellow edges represent there is no path selected when the GO statement traverses. If it is not GO 1 TO 3 STEPS
but GO 3 STEPS
, it will only match the red edges of the third step and the vertices at both ends.In the basketballplayer dataset, the example is as follows:
nebula> GO 3 STEPS FROM \"player100\" \\\n OVER * \\\n YIELD properties($$).name AS NAME, properties($$).age AS Age \\\n SAMPLE [1,2,3];\n+-----------------+----------+\n| NAME | Age |\n+-----------------+----------+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n+-----------------+----------+\n\nnebula> GO 1 TO 3 STEPS FROM \"player100\" \\\n OVER * \\\n YIELD properties($$).name AS NAME, properties($$).age AS Age \\\n SAMPLE [2,2,2];\n+-----------------+----------+\n| NAME | Age |\n+-----------------+----------+\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n| \"Tim Duncan\" | 42 |\n| \"Spurs\" | __NULL__ |\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n+-----------------+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/","title":"TTL","text":"TTL (Time To Live) is a mechanism in NebulaGraph that defines the lifespan of data. Once the data reaches its predefined lifespan, it is automatically deleted from the database. This feature is particularly suitable for data that only needs temporary storage, such as temporary sessions or cached data.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#opencypher_compatibility","title":"OpenCypher Compatibility","text":"This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#precautions","title":"Precautions","text":"TTL options and indexes have coexistence issues.
The native nGQL TTL feature has the following options.
Option Descriptionttl_col
Specifies an existing property to set a lifespan on. The data type of the property must be int
or timestamp
. ttl_duration
Specifies the timeout adds-on value in seconds. The value must be a non-negative int64 number. A property expires if the sum of its value and the ttl_duration
value is smaller than the current timestamp. If the ttl_duration
value is 0
, the property never expires.You can set ttl_use_ms
to true
in the configuration file nebula-storaged.conf
(default path: /usr/local/nightly/etc/
) to set the default unit to milliseconds. Warning
ttl_use_ms
to true
, make sure that no TTL has been set for any property, as shortening the expiration time may cause data to be erroneously deleted.ttl_use_ms
to true
, which sets the default TTL unit to milliseconds, the data type of the property specified by ttl_col
must be int
, and the property value needs to be manually converted to milliseconds. For example, when setting ttl_col
to a
, you need to convert the value of a
to milliseconds, such as when the value of a
is now()
, you need to set the value of a
to now() * 1000
.You must use the TTL options together to set a lifespan on a property.
Before using the TTL feature, you must first create a timestamp or integer property and specify it in the TTL options. NebulaGraph will not automatically create or manage this timestamp property for you.
When inserting the value of the timestamp or integer property, it is recommended to use the now()
function or the current timestamp to represent the present time.
If a tag or an edge type is already created, to set a timeout on a property bound to the tag or edge type, use ALTER
to update the tag or edge type.
# Create a tag.\nnebula> CREATE TAG IF NOT EXISTS t1 (a timestamp);\n\n# Use ALTER to update the tag and set the TTL options.\nnebula> ALTER TAG t1 TTL_COL = \"a\", TTL_DURATION = 5;\n\n# Insert a vertex with tag t1. The vertex expires 5 seconds after the insertion.\nnebula> INSERT VERTEX t1(a) VALUES \"101\":(now());\n
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#set_a_timeout_when_creating_a_tag_or_an_edge_type","title":"Set a timeout when creating a tag or an edge type","text":"Use TTL options in the CREATE
statement to set a timeout when creating a tag or an edge type. For more information, see CREATE TAG and CREATE EDGE.
# Create a tag and set the TTL options.\nnebula> CREATE TAG IF NOT EXISTS t2(a int, b int, c string) TTL_DURATION= 100, TTL_COL = \"a\";\n\n# Insert a vertex with tag t2. The timeout timestamp is 1648197238 (1648197138 + 100).\nnebula> INSERT VERTEX t2(a, b, c) VALUES \"102\":(1648197138, 30, \"Hello\");\n
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#data_expiration_and_deletion","title":"Data expiration and deletion","text":"Caution
NULL
, the property never expires. now()
is added to a tag or an edge type and the TTL options are set for the property, the history data related to the tag or the edge type will never expire because the value of that property for the history data is the current timestamp.Vertex property expiration has the following impact.
Since an edge can have only one edge type, once an edge property expires, the edge expires.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#data_deletion","title":"Data deletion","text":"The expired data are still stored on the disk, but queries will filter them out.
NebulaGraph automatically deletes the expired data and reclaims the disk space during the next compaction.
Note
If TTL is disabled, the corresponding data deleted after the last compaction can be queried again.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#remove_a_timeout","title":"Remove a timeout","text":"To disable TTL and remove the timeout on a property, you can use the following approaches.
nebula> ALTER TAG t1 DROP (a);\n
ttl_col
to an empty string.nebula> ALTER TAG t1 TTL_COL = \"\";\n
ttl_duration
to 0
. This operation keeps the TTL options and prevents the property from expiring and the property schema from being modified.nebula> ALTER TAG t1 TTL_DURATION = 0;\n
UNWIND
transform a list into a sequence of rows.
UNWIND
can be used as an individual statement or as a clause within a statement.
UNWIND <list> AS <alias> <RETURN clause>;\n
"},{"location":"3.ngql-guide/8.clauses-and-options/unwind/#examples","title":"Examples","text":"To transform a list.
nebula> UNWIND [1,2,3] AS n RETURN n;\n+---+\n| n |\n+---+\n| 1 |\n| 2 |\n| 3 |\n+---+\n
The UNWIND
clause in native nGQL statements.
Note
To use a UNWIND
clause in a native nGQL statement, use it after the |
operator and use the $-
prefix for variables. If you use a statement or clause after the UNWIND
clause, use the |
operator and use the $-
prefix for variables.
<statement> | UNWIND $-.<var> AS <alias> <|> <clause>;\n
The UNWIND
clause in openCypher statements.
<statement> UNWIND <list> AS <alias> <RETURN clause>\uff1b\n
To transform a list of duplicates into a unique set of rows using WITH DISTINCT
in a UNWIND
clause.
Note
WITH DISTINCT
is not available in native nGQL statements.
// Transform the list `[1,1,2,2,3,3]` into a unique set of rows, sort the rows, and then transform the rows into a list of unique values.\n\nnebula> WITH [1,1,2,2,3,3] AS n \\\n UNWIND n AS r \\\n WITH DISTINCT r AS r \\\n ORDER BY r \\\n RETURN collect(r);\n+------------+\n| collect(r) |\n+------------+\n| [1, 2, 3] |\n+------------+\n
To use an UNWIND
clause in a MATCH
statement.
// Get a list of the vertices in the matched path, transform the list into a unique set of rows, and then transform the rows into a list. \n\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})--(v2) \\\n WITH nodes(p) AS n \\\n UNWIND n AS r \\\n WITH DISTINCT r AS r \\\n RETURN collect(r);\n+----------------------------------------------------------------------------------------------------------------------+\n| collect(r) |\n+----------------------------------------------------------------------------------------------------------------------+\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player101\" :player{age: 36, name: \"Tony Parker\"}), |\n|(\"team204\" :team{name: \"Spurs\"}), (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}), |\n|(\"player125\" :player{age: 41, name: \"Manu Ginobili\"}), (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}), |\n|(\"player144\" :player{age: 47, name: \"Shaquile O'Neal\"}), (\"player105\" :player{age: 31, name: \"Danny Green\"}), |\n|(\"player113\" :player{age: 29, name: \"Dejounte Murray\"}), (\"player107\" :player{age: 32, name: \"Aron Baynes\"}), |\n|(\"player109\" :player{age: 34, name: \"Tiago Splitter\"}), (\"player108\" :player{age: 36, name: \"Boris Diaw\"})] | \n+----------------------------------------------------------------------------------------------------------------------+\n
To use an UNWIND
clause in a GO
statement.
// Query the vertices in a list for the corresponding edges with a specified statement.\n\nnebula> YIELD ['player101', 'player100'] AS a | UNWIND $-.a AS b | GO FROM $-.b OVER follow YIELD edge AS e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player101\"->\"player100\" @0 {degree: 95}] |\n| [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| [:follow \"player101\"->\"player125\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n+----------------------------------------------------+\n
To use an UNWIND
clause in a LOOKUP
statement.
// Find all the properties of players whose age is greater than 46, get a list of unique properties, and then transform the list into rows. \n\nnebula> LOOKUP ON player \\\n WHERE player.age > 46 \\\n YIELD DISTINCT keys(vertex) as p | UNWIND $-.p as a | YIELD $-.a AS a;\n+--------+\n| a |\n+--------+\n| \"age\" |\n| \"name\" |\n+--------+\n
To use an UNWIND
clause in a FETCH
statement.
// Query player101 for all tags related to player101, get a list of the tags and then transform the list into rows.\n\nnebula> CREATE TAG hero(like string, height int);\n INSERT VERTEX hero(like, height) VALUES \"player101\":(\"deep\", 182);\n FETCH PROP ON * \"player101\" \\\n YIELD tags(vertex) as t | UNWIND $-.t as a | YIELD $-.a AS a;\n+----------+\n| a |\n+----------+\n| \"hero\" |\n| \"player\" |\n+----------+\n
To use an UNWIND
clause in a GET SUBGRAPH
statement.
// Get the subgraph including outgoing and incoming serve edges within 0~2 hops from/to player100, and transform the result into rows.\n\nnebula> GET SUBGRAPH 2 STEPS FROM \"player100\" BOTH serve \\\n YIELD edges as e | UNWIND $-.e as a | YIELD $-.a AS a;\n+----------------------------------------------+\n| a |\n+----------------------------------------------+\n| [:serve \"player100\"->\"team204\" @0 {}] |\n| [:serve \"player101\"->\"team204\" @0 {}] |\n| [:serve \"player102\"->\"team204\" @0 {}] |\n| [:serve \"player103\"->\"team204\" @0 {}] |\n| [:serve \"player105\"->\"team204\" @0 {}] |\n| [:serve \"player106\"->\"team204\" @0 {}] |\n| [:serve \"player107\"->\"team204\" @0 {}] |\n| [:serve \"player108\"->\"team204\" @0 {}] |\n| [:serve \"player109\"->\"team204\" @0 {}] |\n| [:serve \"player110\"->\"team204\" @0 {}] |\n| [:serve \"player111\"->\"team204\" @0 {}] |\n| [:serve \"player112\"->\"team204\" @0 {}] |\n| [:serve \"player113\"->\"team204\" @0 {}] |\n| [:serve \"player114\"->\"team204\" @0 {}] |\n| [:serve \"player125\"->\"team204\" @0 {}] |\n| [:serve \"player138\"->\"team204\" @0 {}] |\n| [:serve \"player104\"->\"team204\" @20132015 {}] |\n| [:serve \"player104\"->\"team204\" @20182019 {}] |\n+----------------------------------------------+\n
To use an UNWIND
clause in a FIND PATH
statement.
// Find all the vertices in the shortest path from player101 to team204 along the serve edge, and transform the result into rows. \n\nnebula> FIND SHORTEST PATH FROM \"player101\" TO \"team204\" OVER serve \\\n YIELD path as p | YIELD nodes($-.p) AS nodes | UNWIND $-.nodes AS a | YIELD $-.a AS a;\n+---------------+\n| a |\n+---------------+\n| (\"player101\") |\n| (\"team204\") |\n+---------------+\n
The WHERE
clause filters the output by conditions.
The WHERE
clause usually works in the following queries:
GO
and LOOKUP
.MATCH
and WITH
.Filtering on edge rank is a native nGQL feature. To retrieve the rank value in openCypher statements, use the rank() function, such as MATCH (:player)-[e:follow]->() RETURN rank(e);
.
Note
In the following examples, $$
and $^
are reference operators. For more information, see Operators.
Use the boolean operators NOT
, AND
, OR
, and XOR
to define conditions in WHERE
clauses. For the precedence of the operators, see Precedence.
nebula> MATCH (v:player) \\\n WHERE v.player.name == \"Tim Duncan\" \\\n XOR (v.player.age < 30 AND v.player.name == \"Yao Ming\") \\\n OR NOT (v.player.name == \"Yao Ming\" OR v.player.name == \"Tim Duncan\") \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Danny Green\" | 31 |\n| \"Tiago Splitter\" | 34 |\n| \"David West\" | 38 |\n...\n
nebula> GO FROM \"player100\" \\\n OVER follow \\\n WHERE properties(edge).degree > 90 \\\n OR properties($$).age != 33 \\\n AND properties($$).name != \"Tony Parker\" \\\n YIELD properties($$);\n+----------------------------------+\n| properties($$) |\n+----------------------------------+\n| {age: 41, name: \"Manu Ginobili\"} |\n+----------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_properties","title":"Filter on properties","text":"Use vertex or edge properties to define conditions in WHERE
clauses.
nebula> MATCH (v:player)-[e]->(v2) \\\n WHERE v2.player.age < 25 \\\n RETURN v2.player.name, v2.player.age;\n+----------------------+---------------+\n| v2.player.name | v2.player.age |\n+----------------------+---------------+\n| \"Ben Simmons\" | 22 |\n| \"Luka Doncic\" | 20 |\n| \"Kristaps Porzingis\" | 23 |\n+----------------------+---------------+\n
nebula> GO FROM \"player100\" OVER follow \\\n WHERE $^.player.age >= 42 \\\n YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
nebula> MATCH (v:player)-[e]->() \\\n WHERE e.start_year < 2000 \\\n RETURN DISTINCT v.player.name, v.player.age;\n+--------------------+--------------+\n| v.player.name | v.player.age |\n+--------------------+--------------+\n| \"Tony Parker\" | 36 |\n| \"Tim Duncan\" | 42 |\n| \"Grant Hill\" | 46 |\n...\n
nebula> GO FROM \"player100\" OVER follow \\\n WHERE follow.degree > 90 \\\n YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
nebula> MATCH (v:player) \\\n WHERE v[toLower(\"AGE\")] < 21 \\\n RETURN v.player.name, v.player.age;\n+---------------+-------+\n| v.name | v.age |\n+---------------+-------+\n| \"Luka Doncic\" | 20 |\n+---------------+-------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_existing_properties","title":"Filter on existing properties","text":"nebula> MATCH (v:player) \\\n WHERE exists(v.player.age) \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Danny Green\" | 31 |\n| \"Tiago Splitter\" | 34 |\n| \"David West\" | 38 |\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_edge_rank","title":"Filter on edge rank","text":"In nGQL, if a group of edges has the same source vertex, destination vertex, and properties, the only thing that distinguishes them is the rank. Use rank conditions in WHERE
clauses to filter such edges.
# The following example creates test data.\nnebula> CREATE SPACE IF NOT EXISTS test (vid_type=FIXED_STRING(30));\nnebula> USE test;\nnebula> CREATE EDGE IF NOT EXISTS e1(p1 int);\nnebula> CREATE TAG IF NOT EXISTS person(p1 int);\nnebula> INSERT VERTEX person(p1) VALUES \"1\":(1);\nnebula> INSERT VERTEX person(p1) VALUES \"2\":(2);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@0:(10);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@1:(11);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@2:(12);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@3:(13);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@4:(14);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@5:(15);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@6:(16);\n\n# The following example use rank to filter edges and retrieves edges with a rank greater than 2.\nnebula> GO FROM \"1\" \\\n OVER e1 \\\n WHERE rank(edge) > 2 \\\n YIELD src(edge), dst(edge), rank(edge) AS Rank, properties(edge).p1 | \\\n ORDER BY $-.Rank DESC;\n+-----------+-----------+------+---------------------+\n| src(EDGE) | dst(EDGE) | Rank | properties(EDGE).p1 |\n+-----------+-----------+------+---------------------+\n| \"1\" | \"2\" | 6 | 16 |\n| \"1\" | \"2\" | 5 | 15 |\n| \"1\" | \"2\" | 4 | 14 |\n| \"1\" | \"2\" | 3 | 13 |\n+-----------+-----------+------+---------------------+\n\n# Filter edges by rank. Find follow edges with rank equal to 0.\nnebula> MATCH (v)-[e:follow]->() \\\n WHERE rank(e)==0 \\\n RETURN *;\n+------------------------------------------------------------+-----------------------------------------------------+\n| v | e |\n+------------------------------------------------------------+-----------------------------------------------------+\n| (\"player142\" :player{age: 29, name: \"Klay Thompson\"}) | [:follow \"player142\"->\"player117\" @0 {degree: 90}] |\n| (\"player139\" :player{age: 34, name: \"Marc Gasol\"}) | [:follow \"player139\"->\"player138\" @0 {degree: 99}] |\n| (\"player108\" :player{age: 36, name: \"Boris Diaw\"}) | [:follow \"player108\"->\"player100\" @0 {degree: 80}] |\n| (\"player108\" :player{age: 36, name: \"Boris Diaw\"}) | [:follow \"player108\"->\"player101\" @0 {degree: 80}] |\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_pattern","title":"Filter on pattern","text":"nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(t) \\\n WHERE (v)-[e]->(t:team) \\\n RETURN (v)-->();\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| (v)-->() = (v)-->() |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| [<(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})>] |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(t) \\\n WHERE NOT (v)-[e]->(t:team) \\\n RETURN (v)-->();\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| (v)-->() = (v)-->() |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| [<(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})>] |\n| [<(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony 
Parker\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})>] |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_strings","title":"Filter on strings","text":"Use STARTS WITH
, ENDS WITH
, or CONTAINS
in WHERE
clauses to match a specific part of a string. String matching is case-sensitive.
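For case-insensitive matching, a common workaround is to normalize the string with a function such as toLower() before comparing. A minimal sketch on the basketballplayer dataset (the availability of toLower() is assumed here):
nebula> MATCH (v:player) \\\n WHERE toLower(v.player.name) STARTS WITH \"t\" \\\n RETURN v.player.name;\n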
STARTS WITH
","text":"STARTS WITH
will match the beginning of a string.
The following example uses STARTS WITH \"T\"
to retrieve the information of players whose name starts with T
.
nebula> MATCH (v:player) \\\n WHERE v.player.name STARTS WITH \"T\" \\\n RETURN v.player.name, v.player.age;\n+------------------+--------------+\n| v.player.name | v.player.age |\n+------------------+--------------+\n| \"Tony Parker\" | 36 |\n| \"Tiago Splitter\" | 34 |\n| \"Tim Duncan\" | 42 |\n| \"Tracy McGrady\" | 39 |\n+------------------+--------------+\n
If you use STARTS WITH \"t\"
in the preceding statement, an empty set is returned because no name in the dataset starts with the lowercase t
.
nebula> MATCH (v:player) \\\n WHERE v.player.name STARTS WITH \"t\" \\\n RETURN v.player.name, v.player.age;\n+---------------+--------------+\n| v.player.name | v.player.age |\n+---------------+--------------+\n+---------------+--------------+\nEmpty set (time spent 5080/6474 us)\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#ends_with","title":"ENDS WITH
","text":"ENDS WITH
will match the ending of a string.
The following example uses ENDS WITH \"r\"
to retrieve the information of players whose name ends with r
.
nebula> MATCH (v:player) \\\n WHERE v.player.name ENDS WITH \"r\" \\\n RETURN v.player.name, v.player.age;\n+------------------+--------------+\n| v.player.name | v.player.age |\n+------------------+--------------+\n| \"Tony Parker\" | 36 |\n| \"Tiago Splitter\" | 34 |\n| \"Vince Carter\" | 42 |\n+------------------+--------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#contains","title":"CONTAINS
","text":"CONTAINS
will match any part of a string.
The following example uses CONTAINS \"Pa\"
to match the information of players whose name contains Pa
.
nebula> MATCH (v:player) \\\n WHERE v.player.name CONTAINS \"Pa\" \\\n RETURN v.player.name, v.player.age;\n+---------------+--------------+\n| v.player.name | v.player.age |\n+---------------+--------------+\n| \"Paul George\" | 28 |\n| \"Tony Parker\" | 36 |\n| \"Paul Gasol\" | 38 |\n| \"Chris Paul\" | 33 |\n+---------------+--------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#negative_string_matching","title":"Negative string matching","text":"You can use the boolean operator NOT
to negate a string matching condition.
nebula> MATCH (v:player) \\\n WHERE NOT v.player.name ENDS WITH \"R\" \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Danny Green\" | 31 |\n| \"Tiago Splitter\" | 34 |\n| \"David West\" | 38 |\n| \"Russell Westbrook\" | 30 |\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_lists","title":"Filter on lists","text":""},{"location":"3.ngql-guide/8.clauses-and-options/where/#match_values_in_a_list","title":"Match values in a list","text":"Use the IN
operator to check if a value is in a specific list.
nebula> MATCH (v:player) \\\n WHERE v.player.age IN range(20,25) \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Ben Simmons\" | 22 |\n| \"Giannis Antetokounmpo\" | 24 |\n| \"Kyle Anderson\" | 25 |\n| \"Joel Embiid\" | 25 |\n| \"Kristaps Porzingis\" | 23 |\n| \"Luka Doncic\" | 20 |\n+-------------------------+--------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.age IN [25,28] \\\n YIELD properties(vertex).name, properties(vertex).age;\n+-------------------------+------------------------+\n| properties(VERTEX).name | properties(VERTEX).age |\n+-------------------------+------------------------+\n| \"Kyle Anderson\" | 25 |\n| \"Damian Lillard\" | 28 |\n| \"Joel Embiid\" | 25 |\n| \"Paul George\" | 28 |\n| \"Ricky Rubio\" | 28 |\n+-------------------------+------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#match_values_not_in_a_list","title":"Match values not in a list","text":"Use NOT
before IN
to rule out the values in a list.
nebula> MATCH (v:player) \\\n WHERE v.player.age NOT IN range(20,25) \\\n RETURN v.player.name AS Name, v.player.age AS Age \\\n ORDER BY Age;\n+---------------------+-----+\n| Name | Age |\n+---------------------+-----+\n| \"Kyrie Irving\" | 26 |\n| \"Cory Joseph\" | 27 |\n| \"Damian Lillard\" | 28 |\n| \"Paul George\" | 28 |\n| \"Ricky Rubio\" | 28 |\n+---------------------+-----+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/","title":"WITH","text":"The WITH
clause can retrieve the output from a query part, process it, and pass it to the next query part as the input.
This topic applies to openCypher syntax only.
Note
WITH
serves a function similar to the pipe symbol (|) in native nGQL, but they work in different ways. DO NOT use the pipe symbol in openCypher syntax or use WITH
in native nGQL statements.
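For contrast, the two sketches below express a similar traversal in each style; they are illustrative, not strict equivalents, and assume the basketballplayer dataset:
# Native nGQL: chain statements with the pipe symbol.\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge) AS id | \\\n FETCH PROP ON player $-.id YIELD properties(vertex).name;\n\n# openCypher: pass intermediate results with WITH.\nnebula> MATCH (v:player)-[:follow]->(v2:player) \\\n WHERE id(v) == \"player100\" \\\n WITH v2.player.name AS Name \\\n RETURN Name;\n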
Use a WITH
clause to combine statements and transfer the output of a statement as the input of another statement.
The following statement matches a path, and outputs all the vertices on the path to a list with the
nodes()
function.nebula> MATCH p=(v:player{name:\"Tim Duncan\"})--() \\\n WITH nodes(p) AS n \\\n UNWIND n AS n1 \\\n RETURN DISTINCT n1;\n+-----------------------------------------------------------+\n| n1 |\n+-----------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"team204\" :team{name: \"Spurs\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}) |\n| (\"player144\" :player{age: 47, name: \"Shaquille O'Neal\"}) |\n| (\"player105\" :player{age: 31, name: \"Danny Green\"}) |\n| (\"player113\" :player{age: 29, name: \"Dejounte Murray\"}) |\n| (\"player107\" :player{age: 32, name: \"Aron Baynes\"}) |\n| (\"player109\" :player{age: 34, name: \"Tiago Splitter\"}) |\n| (\"player108\" :player{age: 36, name: \"Boris Diaw\"}) |\n+-----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#example_2","title":"Example 2","text":"The following statement:
player100
.labels()
function.nebula> MATCH (v) \\\n WHERE id(v)==\"player100\" \\\n WITH labels(v) AS tags_unf \\\n UNWIND tags_unf AS tags_f \\\n RETURN tags_f;\n+----------+\n| tags_f |\n+----------+\n| \"player\" |\n+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#filter_composite_queries","title":"Filter composite queries","text":"WITH
can work as a filter in the middle of a composite query.
nebula> MATCH (v:player)-->(v2:player) \\\n WITH DISTINCT v2 AS v2, v2.player.age AS Age \\\n ORDER BY Age \\\n WHERE Age<25 \\\n RETURN v2.player.name AS Name, Age;\n+----------------------+-----+\n| Name | Age |\n+----------------------+-----+\n| \"Luka Doncic\" | 20 |\n| \"Ben Simmons\" | 22 |\n| \"Kristaps Porzingis\" | 23 |\n+----------------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#process_the_output_before_using_collect","title":"Process the output before using collect()","text":"Use a WITH
clause to sort and limit the output before using collect()
to transform the output into a list.
nebula> MATCH (v:player) \\\n WITH v.player.name AS Name \\\n ORDER BY Name DESC \\\n LIMIT 3 \\\n RETURN collect(Name);\n+-----------------------------------------------+\n| collect(Name) |\n+-----------------------------------------------+\n| [\"Yao Ming\", \"Vince Carter\", \"Tracy McGrady\"] |\n+-----------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#use_with_return","title":"Use with RETURN","text":"Set an alias using a WITH
clause, and then output the result through a RETURN
clause.
nebula> WITH [1, 2, 3] AS `list` RETURN 3 IN `list` AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> WITH 4 AS one, 3 AS two RETURN one > two AS result;\n+--------+\n| result |\n+--------+\n| true |\n+--------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/yield/","title":"YIELD","text":"YIELD
defines the output of an nGQL query.
YIELD
can lead a clause or a statement:
YIELD
clause works in nGQL statements such as GO
, FETCH
, or LOOKUP
and must be defined to return the result.YIELD
statement works in a composite query or independently.This topic applies to native nGQL only. For the openCypher syntax, use RETURN
.
YIELD
has different functions in openCypher and nGQL.
In openCypher, YIELD
is used in the CALL[…YIELD]
clause to specify the output of the procedure call.
Note
nGQL does not support CALL[…YIELD]
yet.
YIELD
works like RETURN
in openCypher.Note
In the following examples, $$
and $-
are property references. For more information, see Reference to properties.
YIELD [DISTINCT] <col> [AS <alias>] [, <col> [AS <alias>] ...];\n
Parameter Description DISTINCT
Aggregates the output and makes the statement return a distinct result set. col
A field to be returned. If no alias is set, col
will be a column name in the output. alias
An alias for col
. It is set after the keyword AS
and will be a column name in the output."},{"location":"3.ngql-guide/8.clauses-and-options/yield/#use_a_yield_clause_in_a_statement","title":"Use a YIELD clause in a statement","text":"YIELD
with GO
:nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties($$).name AS Friend, properties($$).age AS Age;\n+-----------------+-----+\n| Friend | Age |\n+-----------------+-----+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n+-----------------+-----+\n
YIELD
with FETCH
:nebula> FETCH PROP ON player \"player100\" \\\n YIELD properties(vertex).name;\n+-------------------------+\n| properties(VERTEX).name |\n+-------------------------+\n| \"Tim Duncan\" |\n+-------------------------+\n
YIELD
with LOOKUP
:nebula> LOOKUP ON player WHERE player.name == \"Tony Parker\" \\\n YIELD properties(vertex).name, properties(vertex).age;\n+-------------------------+------------------------+\n| properties(VERTEX).name | properties(VERTEX).age |\n+-------------------------+------------------------+\n| \"Tony Parker\" | 36 |\n+-------------------------+------------------------+\n
YIELD [DISTINCT] <col> [AS <alias>] [, <col> [AS <alias>] ...]\n[WHERE <conditions>];\n
Parameter Description DISTINCT
Aggregates the output and makes the statement return a distinct result set. col
A field to be returned. If no alias is set, col
will be a column name in the output. alias
An alias for col
. It is set after the keyword AS
and will be a column name in the output. conditions
Conditions set in a WHERE
clause to filter the output. For more information, see WHERE
."},{"location":"3.ngql-guide/8.clauses-and-options/yield/#use_a_yield_statement_in_a_composite_query","title":"Use a YIELD statement in a composite query","text":"In a composite query, a YIELD
statement accepts, filters, and modifies the result set of the preceding statement, and then outputs it.
The following query finds the players that \"player100\" follows and calculates their average age.
nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS ID \\\n | FETCH PROP ON player $-.ID \\\n YIELD properties(vertex).age AS Age \\\n | YIELD AVG($-.Age) as Avg_age, count(*)as Num_friends;\n+---------+-------------+\n| Avg_age | Num_friends |\n+---------+-------------+\n| 38.5 | 2 |\n+---------+-------------+\n
The following query finds the players that \"player101\" follows with the follow degrees greater than 90.
nebula> $var1 = GO FROM \"player101\" OVER follow \\\n YIELD properties(edge).degree AS Degree, dst(edge) as ID; \\\n YIELD $var1.ID AS ID WHERE $var1.Degree > 90;\n+-------------+\n| ID |\n+-------------+\n| \"player100\" |\n| \"player125\" |\n+-------------+\n
The following query finds the vertices in the player that are older than 30 and younger than 32, and returns the de-duplicate results.
nebula> LOOKUP ON player \\\n WHERE player.age < 32 and player.age >30 \\\n YIELD DISTINCT properties(vertex).age as v;\n+--------+\n| v |\n+--------+\n| 31 |\n+--------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/yield/#use_a_standalone_yield_statement","title":"Use a standalone YIELD statement","text":"A YIELD
statement can calculate a valid expression and output the result.
nebula> YIELD rand32(1, 6);\n+-------------+\n| rand32(1,6) |\n+-------------+\n| 3 |\n+-------------+\n\nnebula> YIELD \"Hel\" + \"\\tlo\" AS string1, \", World!\" AS string2;\n+-------------+------------+\n| string1 | string2 |\n+-------------+------------+\n| \"Hel lo\" | \", World!\" |\n+-------------+------------+\n\nnebula> YIELD hash(\"Tim\") % 100;\n+-----------------+\n| (hash(Tim)%100) |\n+-----------------+\n| 42 |\n+-----------------+\n\nnebula> YIELD \\\n CASE 2+3 \\\n WHEN 4 THEN 0 \\\n WHEN 5 THEN 1 \\\n ELSE -1 \\\n END \\\n AS result;\n+--------+\n| result |\n+--------+\n| 1 |\n+--------+\n\nnebula> YIELD 1- -1;\n+----------+\n| (1--(1)) |\n+----------+\n| 2 |\n+----------+\n
"},{"location":"3.ngql-guide/9.space-statements/1.create-space/","title":"CREATE SPACE","text":"Graph spaces are used to store data in a physically isolated way in NebulaGraph, which is similar to the database concept in MySQL. The CREATE SPACE
statement can create a new graph space or clone the schema of an existing graph space.
Only the God role can use the CREATE SPACE
statement. For more information, see AUTHENTICATION.
CREATE SPACE [IF NOT EXISTS] <graph_space_name> (\n [partition_num = <partition_number>,]\n [replica_factor = <replica_number>,]\n vid_type = {FIXED_STRING(<N>) | INT[64]}\n )\n [COMMENT = '<comment>']\n
Parameter Description IF NOT EXISTS
Detects if the related graph space exists. If it does not exist, a new one will be created. The graph space existence detection here only compares the graph space name (excluding properties). <graph_space_name>
1. Uniquely identifies a graph space in a NebulaGraph instance. 2. Space names cannot be modified after they are set. 3. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number. 4. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words. Note:1. If you name a space in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in a space name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . partition_num
Specifies the number of partitions in each replica. The suggested value is 20 times (2 times for HDD) the number of the hard disks in the cluster. For example, if you have three hard disks in the cluster, we recommend that you set 60 partitions. The default value is 100. replica_factor
Specifies the number of replicas in the cluster. The suggested number is 3 in a production environment and 1 in a test environment. The replica number must be an odd number for the need of quorum-based voting. The default value is 1. vid_type
A required parameter. Specifies the VID type in a graph space. Available values are FIXED_STRING(N)
and INT64
. INT
is equivalent to INT64
. FIXED_STRING(<N>)
specifies the VID as a string, while INT64
specifies it as an integer. N
represents the maximum length of the VIDs. If you set a VID that is longer than N
bytes, NebulaGraph throws an error. Note that for UTF-8 characters the length may vary: a UTF-8-encoded Chinese character takes 3 bytes, so 11 Chinese characters (33 bytes in total) exceed a FIXED_STRING(32) VID definition. COMMENT
The remarks of the graph space. The maximum length is 256 bytes. By default, there are no comments on a space. Caution
Restrictions on VID type change and VID length:
For NebulaGraph v1.x, the type of VIDs can only be INT64
, and the String type is not allowed. For NebulaGraph v2.x, both INT64
and FIXED_STRING(<N>)
VID types are allowed. You must specify the VID type when creating a graph space, and use the same VID type in INSERT
statements, otherwise, an error message Wrong vertex id type: 1001
occurs.The VID should not be longer than N
characters. If it exceeds N
, NebulaGraph throws The VID must be a 64-bit integer or a string fitting space vertex id length limit.
.If the Host not enough!
error appears, the immediate cause is that the number of online storage hosts is less than the value of replica_factor
specified when creating a graph space. In this case, you can use the SHOW HOSTS
command to see if the following situations occur:
The whole cluster has only one Storage host. In this case, replica_factor
can only be specified to 1
. Or create a graph space after storage hosts are scaled out. A Storage host is newly added, but ADD HOSTS
is not executed to activate it. In this case, run SHOW HOSTS
to locate the new storage host information and then run ADD HOSTS
to activate it. A graph space can be created after there are enough storage hosts.If neither of the preceding situations is shown by SHOW HOSTS
, further troubleshooting is needed.
For NebulaGraph v2.x before v2.5.0, vid_type
is optional and defaults to FIXED_STRING(8)
.
Note
graph_space_name
, partition_num
, replica_factor
, vid_type
, and comment
cannot be modified once set. To modify them, drop the current working graph space with DROP SPACE
and create a new one with CREATE SPACE
.
CREATE SPACE [IF NOT EXISTS] <new_graph_space_name> AS <old_graph_space_name>;\n
Parameter Description IF NOT EXISTS
Detects if the new graph space exists. If it does not exist, the new one will be created. The graph space existence detection here only compares the graph space name (excluding properties). <new_graph_space_name>
The name of the graph space that is newly created. By default, the space name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. But special characters can only use underscore, and cannot start with a number. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and cannot use periods (.
). For more information, see Keywords and reserved words. When a new graph space is created, the schema of the old graph space <old_graph_space_name>
will be cloned, including its parameters (the number of partitions and replicas, etc.), Tag, Edge type and native indexes. Note:1. If you name a space in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in a space name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <old_graph_space_name>
The name of the graph space that already exists."},{"location":"3.ngql-guide/9.space-statements/1.create-space/#examples","title":"Examples","text":"# The following example creates a graph space with a specified VID type and the maximum length. Other fields still use the default values.\nnebula> CREATE SPACE IF NOT EXISTS my_space_1 (vid_type=FIXED_STRING(30));\n\n# The following example creates a graph space with a specified partition number, replica number, and VID type.\nnebula> CREATE SPACE IF NOT EXISTS my_space_2 (partition_num=15, replica_factor=1, vid_type=FIXED_STRING(30));\n\n# The following example creates a graph space with a specified partition number, replica number, and VID type, and adds a comment on it.\nnebula> CREATE SPACE IF NOT EXISTS my_space_3 (partition_num=15, replica_factor=1, vid_type=FIXED_STRING(30)) comment=\"Test the graph space\";\n\n# Clone a graph space.\nnebula> CREATE SPACE IF NOT EXISTS my_space_4 as my_space_3;\nnebula> SHOW CREATE SPACE my_space_4;\n+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| Space | Create Space |\n+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| \"my_space_4\" | \"CREATE SPACE `my_space_4` (partition_num = 15, replica_factor = 1, charset = utf8, collate = utf8_bin, vid_type = FIXED_STRING(30)) comment = '\u6d4b\u8bd5\u56fe\u7a7a\u95f4'\" |\n+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/9.space-statements/1.create-space/#implementation_of_the_operation","title":"Implementation of the operation","text":"Caution
Trying to use a newly created graph space may fail because the creation is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services. If the heartbeat interval is too short (i.e., less than 5 seconds), disconnection between peers may happen because of the misjudgment of machines in the distributed system.
On some large clusters, the partition distribution is possibly unbalanced because of the different startup times. You can run the following command to do a check of the machine distribution.
nebula> SHOW HOSTS;\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 8 | \"basketballplayer:3, test:5\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 9 | \"basketballplayer:4, test:5\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 3 | \"basketballplayer:3\" | \"basketballplayer:10, test:10\" | \"master\" |\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n
To balance the request loads, use the following command.
nebula> BALANCE LEADER;\nnebula> SHOW HOSTS;\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n| Host | Port | HTTP port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+-----------+----------+--------------+--------------------------------+--------------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 7 | \"basketballplayer:3, test:4\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 7 | \"basketballplayer:4, test:3\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 6 | \"basketballplayer:3, test:3\" | \"basketballplayer:10, test:10\" | \"master\" |\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n
"},{"location":"3.ngql-guide/9.space-statements/2.use-space/","title":"USE","text":"USE
specifies a graph space as the current working graph space for subsequent queries.
Running the USE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
USE <graph_space_name>;\n
"},{"location":"3.ngql-guide/9.space-statements/2.use-space/#examples","title":"Examples","text":"# The following example creates two sample spaces.\nnebula> CREATE SPACE IF NOT EXISTS space1 (vid_type=FIXED_STRING(30));\nnebula> CREATE SPACE IF NOT EXISTS space2 (vid_type=FIXED_STRING(30));\n\n# The following example specifies space1 as the current working graph space.\nnebula> USE space1;\n\n# The following example specifies space2 as the current working graph space. Hereafter, you cannot read any data from space1, because these vertices and edges being traversed have no relevance with space1.\nnebula> USE space2;\n
Caution
You cannot use two graph spaces in one statement.
Different from Fabric Cypher, graph spaces in NebulaGraph are fully isolated from each other. Making a graph space as the working graph space prevents you from accessing other spaces. The only way to traverse in a new graph space is to switch by the USE
statement. In Fabric Cypher, you can use two graph spaces in one statement (using the USE + CALL
syntax). But in NebulaGraph, you can only use one graph space in one statement.
SHOW SPACES
lists all the graph spaces in the NebulaGraph examples.
SHOW SPACES;\n
"},{"location":"3.ngql-guide/9.space-statements/3.show-spaces/#example","title":"Example","text":"nebula> SHOW SPACES;\n+--------------------+\n| Name |\n+--------------------+\n| \"cba\" |\n| \"basketballplayer\" |\n+--------------------+\n
To create graph spaces, see CREATE SPACE.
"},{"location":"3.ngql-guide/9.space-statements/4.describe-space/","title":"DESCRIBE SPACE","text":"DESCRIBE SPACE
returns the information about the specified graph space.
You can use DESC
instead of DESCRIBE
for short.
DESC[RIBE] SPACE <graph_space_name>;\n
The DESCRIBE SPACE
statement is different from the SHOW SPACES
statement. For details about SHOW SPACES
, see SHOW SPACES.
nebula> DESCRIBE SPACE basketballplayer;\n+----+--------------------+------------------+----------------+---------+------------+--------------------+---------+\n| ID | Name | Partition Number | Replica Factor | Charset | Collate | Vid Type | Comment |\n+----+--------------------+------------------+----------------+---------+------------+--------------------+---------+\n| 1 | \"basketballplayer\" | 10 | 1 | \"utf8\" | \"utf8_bin\" | \"FIXED_STRING(32)\" | |\n+----+--------------------+------------------+----------------+---------+------------+--------------------+---------+\n
"},{"location":"3.ngql-guide/9.space-statements/5.drop-space/","title":"DROP SPACE","text":"DROP SPACE
deletes the specified graph space and everything in it.
Note
DROP SPACE
can only delete the specified logic graph space while retain all the data on the hard disk by modifying the value of auto_remove_invalid_space
to false
in the Storage service configuration file. For more information, see Storage configuration.
Warning
After you execute DROP SPACE
, even if the snapshot contains data of the graph space, the data of the graph space cannot be recovered.
Only the God role can use the DROP SPACE
statement. For more information, see AUTHENTICATION.
DROP SPACE [IF EXISTS] <graph_space_name>;\n
You can use the IF EXISTS
keywords when dropping spaces. These keywords automatically detect if the related graph space exists. If it exists, it will be deleted. Otherwise, no graph space will be deleted.
Legacy version compatibility
In NebulaGraph versions earlier than 3.1.0, the DROP SPACE
statement does not remove all the files and directories from the disk by default.
Danger
BE CAUTIOUS about running the DROP SPACE
statement.
Q: Why is my disk space not freed after executing the 'DROP SPACE' statement and deleting a graph space?
A: For NebulaGraph version earlier than 3.1.0, DROP SPACE
can only delete the specified logic graph space and does not delete the files and directories on the disk. To delete the files and directories on the disk, manually delete the corresponding file path. The file path is located in <nebula_graph_install_path>/data/storage/nebula/<space_id>
. The <space_id>
can be viewed via DESCRIBE SPACE {space_name}
.
CLEAR SPACE
deletes the vertices and edges in a graph space, but does not delete the graph space itself and the schema information.
Note
It is recommended to execute SUBMIT JOB COMPACT immediately after executing the CLEAR SPACE
operation improve the query performance. Note that the COMPACT operation may affect query performance, and it is recommended to perform this operation during low business hours (e.g., early morning).
Only the God role has the permission to run CLEAR SPACE
.
CLEAR SPACE
with caution.CLEAR SPACE
is not an atomic operation. If an error occurs, re-run CLEAR SPACE
to avoid data remaining.storage_client_timeout_ms
parameter in the Graph Service configuration.CLEAR SPACE
, writing data into the graph space is not automatically prohibited. Such write operations can result in incomplete data clearing, and the residual data can be damaged.Note
The NebulaGraph Community Edition does not support blocking data writing while allowing CLEAR SPACE
.
CLEAR SPACE [IF EXISTS] <space_name>;\n
Parameter/Option Description IF EXISTS
Check whether the graph space to be cleared exists. If it exists, continue to clear it. If it does not exist, the execution finishes, and a message indicating that the execution succeeded is displayed. If IF EXISTS
is not set and the graph space does not exist, the CLEAR SPACE
statement fails to execute, and an error occurs. space_name
The name of the space to be cleared. Example:
CLEAR SPACE basketballplayer;\n
"},{"location":"3.ngql-guide/9.space-statements/6.clear-space/#data_reserved","title":"Data reserved","text":"CLEAR SPACE
does not delete the following data in a graph space:
The following example shows what CLEAR SPACE
deletes and reserves.
# Enter the graph space basketballplayer.\nnebula [(none)]> use basketballplayer;\nExecution succeeded\n\n# List tags and Edge types.\nnebula[basketballplayer]> SHOW TAGS;\n+----------+\n| Name |\n+----------+\n| \"player\" |\n| \"team\" |\n+----------+\nGot 2 rows\n\nnebula[basketballplayer]> SHOW EDGES;\n+----------+\n| Name |\n+----------+\n| \"follow\" |\n| \"serve\" |\n+----------+\nGot 2 rows\n\n# Submit a job to make statistics of the graph space.\nnebula[basketballplayer]> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 4 |\n+------------+\nGot 1 rows\n\n# Check the statistics.\nnebula[basketballplayer]> SHOW STATS;\n+---------+------------+-------+\n| Type | Name | Count |\n+---------+------------+-------+\n| \"Tag\" | \"player\" | 51 |\n| \"Tag\" | \"team\" | 30 |\n| \"Edge\" | \"follow\" | 81 |\n| \"Edge\" | \"serve\" | 152 |\n| \"Space\" | \"vertices\" | 81 |\n| \"Space\" | \"edges\" | 233 |\n+---------+------------+-------+\nGot 6 rows\n\n# List tag indexes.\nnebula[basketballplayer]> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\nGot 2 rows\n\n# ----------------------- Dividing line for CLEAR SPACE -----------------------\n# Run CLEAR SPACE to clear the graph space basketballplayer.\nnebula[basketballplayer]> CLEAR SPACE basketballplayer;\nExecution succeeded\n\n# Update the statistics.\nnebula[basketballplayer]> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 5 |\n+------------+\nGot 1 rows\n\n# Check the statistics. The tags and edge types still exist, but all the vertices and edges are gone.\nnebula[basketballplayer]> SHOW STATS;\n+---------+------------+-------+\n| Type | Name | Count |\n+---------+------------+-------+\n| \"Tag\" | \"player\" | 0 |\n| \"Tag\" | \"team\" | 0 |\n| \"Edge\" | \"follow\" | 0 |\n| \"Edge\" | \"serve\" | 0 |\n| \"Space\" | \"vertices\" | 0 |\n| \"Space\" | \"edges\" | 0 |\n+---------+------------+-------+\nGot 6 rows\n\n# Try to list the tag indexes. They still exist.\nnebula[basketballplayer]> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\nGot 2 rows (time spent 523/978 us)\n
"},{"location":"4.deployment-and-installation/1.resource-preparations/","title":"Prepare resources for compiling, installing, and running NebulaGraph","text":"This topic describes the requirements and suggestions for compiling and installing NebulaGraph, as well as how to estimate the resource you need to reserve for running a NebulaGraph cluster.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#about_storage_devices","title":"About storage devices","text":"NebulaGraph is designed and implemented for NVMe SSD. All default parameters are optimized for the SSD devices and require extremely high IOPS and low latency.
Starting with 3.0.2, you can run containerized NebulaGraph databases on Docker Desktop for ARM macOS or on ARM Linux servers.
Caution
We do not recommend you deploy NebulaGraph on Docker Desktop for Windows due to its subpar performance. For details, see #12401.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#requirements_for_compiling_the_source_code","title":"Requirements for compiling the source code","text":""},{"location":"4.deployment-and-installation/1.resource-preparations/#hardware_requirements_for_compiling_nebulagraph","title":"Hardware requirements for compiling NebulaGraph","text":"Item Requirement CPU architecture x86_64 Memory 4 GB Disk 10 GB, SSD"},{"location":"4.deployment-and-installation/1.resource-preparations/#supported_operating_systems_for_compiling_nebulagraph","title":"Supported operating systems for compiling NebulaGraph","text":"For now, we can only compile NebulaGraph in the Linux system. We recommend that you use any Linux system with kernel version 4.15
or above.
Note
To install NebulaGraph on Linux systems with kernel version lower than required, use RPM/DEB packages or TAR files.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#software_requirements_for_compiling_nebulagraph","title":"Software requirements for compiling NebulaGraph","text":"You must have the correct version of the software listed below to compile NebulaGraph. If they are not as required or you are not sure, follow the steps in Prepare software for compiling NebulaGraph to get them ready.
Software Version Note glibc 2.17 or above You can runldd --version
to check the glibc version. make Any stable version - m4 Any stable version - git Any stable version - wget Any stable version - unzip Any stable version - xz Any stable version - readline-devel Any stable version - ncurses-devel Any stable version - zlib-devel Any stable version - g++ 8.5.0 or above You can run gcc -v
to check the gcc version. cmake 3.14.0 or above You can run cmake --version
to check the cmake version. curl Any stable version - redhat-lsb-core Any stable version - libstdc++-static Any stable version Only needed in CentOS 8+, RedHat 8+, and Fedora systems. libasan Any stable version Only needed in CentOS 8+, RedHat 8+, and Fedora systems. bzip2 Any stable version - Other third-party software will be automatically downloaded and installed to the build
directory at the configure (cmake) stage.
If part of the dependencies are missing or the versions does not meet the requirements, manually install them with the following steps. You can skip unnecessary dependencies or steps according to your needs.
Install dependencies.
$ yum update\n$ yum install -y make \\\n m4 \\\n git \\\n wget \\\n unzip \\\n xz \\\n readline-devel \\\n ncurses-devel \\\n zlib-devel \\\n gcc \\\n gcc-c++ \\\n cmake \\\n curl \\\n redhat-lsb-core \\\n bzip2\n // For CentOS 8+, RedHat 8+, and Fedora, install libstdc++-static and libasan as well\n$ yum install -y libstdc++-static libasan\n
$ apt-get update\n$ apt-get install -y make \\\n m4 \\\n git \\\n wget \\\n unzip \\\n xz-utils \\\n curl \\\n lsb-core \\\n build-essential \\\n libreadline-dev \\\n ncurses-dev \\\n cmake \\\n bzip2\n
Check if the GCC and cmake on your host are in the right version. See Software requirements for compiling NebulaGraph for the required versions.
$ g++ --version\n$ cmake --version\n
If your GCC and CMake are in the right versions, then you are all set and you can ignore the subsequent steps. If they are not, select and perform the needed steps as follows.
If the CMake version is incorrect, visit the CMake official website to install the required version.
If the G++ version is incorrect, visit the G++ official website or follow the instructions below to to install the required version.
For CentOS users, run:
yum install centos-release-scl\nyum install devtoolset-11\nscl enable devtoolset-11 'bash'\n
For Ubuntu users, run:
add-apt-repository ppa:ubuntu-toolchain-r/test\napt install gcc-11 g++-11\n
For now, we can only install NebulaGraph in the Linux system. To install NebulaGraph in a test environment, we recommend that you use any Linux system with kernel version 3.9
or above.
For example, for a single-machine test environment, you can deploy 1 metad, 1 storaged, and 1 graphd processes in the machine.
For a more common test environment, such as a cluster of 3 machines (named as A, B, and C), you can deploy NebulaGraph as follows:
Machine name Number of metad Number of storaged Number of graphd A 1 1 1 B None 1 1 C None 1 1"},{"location":"4.deployment-and-installation/1.resource-preparations/#requirements_and_suggestions_for_installing_nebulagraph_in_production_environments","title":"Requirements and suggestions for installing NebulaGraph in production environments","text":""},{"location":"4.deployment-and-installation/1.resource-preparations/#hardware_requirements_for_production_environments","title":"Hardware requirements for production environments","text":"Item Requirement CPU architecture x86_64 Number of CPU core 48 Memory 256 GB Disk 2 * 1.6 TB, NVMe SSD"},{"location":"4.deployment-and-installation/1.resource-preparations/#supported_operating_systems_for_production_environments","title":"Supported operating systems for production environments","text":"For now, we can only install NebulaGraph in the Linux system. To install NebulaGraph in a production environment, we recommend that you use any Linux system with kernel version 3.9 or above.
Users can adjust some of the kernel parameters to better accommodate the need for running NebulaGraph. For more information, see kernel configuration.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#suggested_service_architecture_for_production_environments","title":"Suggested service architecture for production environments","text":"Danger
DO NOT deploy a single cluster across IDCs (The Enterprise Edtion supports data synchronization between clusters across IDCs).
Process Suggested number metad (the metadata service process) 3 storaged (the storage service process) 3 or more graphd (the query engine service process) 3 or moreEach metad process automatically creates and maintains a replica of the metadata. Usually, you need to deploy three metad processes and only three.
The number of storaged processes does not affect the number of graph space replicas.
Users can deploy multiple processes on a single machine. For example, on a cluster of 5 machines (named as A, B, C, D, and E), you can deploy NebulaGraph as follows:
Machine name Number of metad Number of storaged Number of graphd A 1 1 1 B 1 1 1 C 1 1 1 D None 1 1 E None 1 1"},{"location":"4.deployment-and-installation/1.resource-preparations/#capacity_requirements_for_running_a_nebulagraph_cluster","title":"Capacity requirements for running a NebulaGraph cluster","text":"Users can estimate the memory, disk space, and partition number needed for a NebulaGraph cluster of 3 replicas as follows.
Resource Unit How to estimate Description Disk space for a cluster Bytesthe_sum_of_edge_number_and_vertex_number
* average_bytes_of_properties
* 7.5 * 120% For more information, see Edge partitioning and storage amplification. Memory for a cluster Bytes [the_sum_of_edge_number_and_vertex_number
* 16 + the_number_of_RocksDB_instances
* (write_buffer_size
* max_write_buffer_number
) + rocksdb_block_cache
] * 120% write_buffer_size
and max_write_buffer_number
are RocksDB parameters. For more information, see MemTable. For details about rocksdb_block_cache
, see Memory usage in RocksDB. Number of partitions for a graph space - the_number_of_disks_in_the_cluster
* disk_partition_num_multiplier
disk_partition_num_multiplier
is an integer between 2 and 20 (both including). Its value depends on the disk performance. Use 20 for SSD and 2 for HDD. Answer: On one hand, the data in one single replica takes up about 2.5 times more space than that of the original data file (csv) according to test values. On the other hand, indexes take up additional space. Each indexed vertex or edge takes up 16 bytes of memory. The hard disk space occupied by the index can be empirically estimated as the total number of indexed vertices or edges * 50 bytes.
Answer: The extra 20% is for buffer.
Question 3: How to get the number of RocksDB instances?
Answer: Each graph space corresponds to one RocksDB instance and each directory in the --data_path
item in the etc/nebula-storaged.conf
file corresponds to one RocksDB instance. That is, the number of RocksDB instances = the number of directories * the number of graph spaces.
Note
Users can decrease the memory size occupied by the bloom filter by adding --enable_partitioned_index_filter=true
in etc/nebula-storaged.conf
. But it may decrease the read performance in some random-seek cases.
Caution
Each RocksDB instance takes up about 70M of disk space even when no data has been written yet. One partition corresponds to one RocksDB instance, and when the partition setting is very large, for example, 100, the graph space takes up a lot of disk space after it is created.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/","title":"Uninstall NebulaGraph","text":"This topic describes how to uninstall NebulaGraph.
Caution
Before re-installing NebulaGraph on a machine, follow this topic to completely uninstall the old NebulaGraph, in case the remaining data interferes with the new services, including inconsistencies between Meta services.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/#prerequisite","title":"Prerequisite","text":"The NebulaGraph services should be stopped before the uninstallation. For more information, see Manage NebulaGraph services.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/#step_1_delete_data_files_of_the_storage_and_meta_services","title":"Step 1: Delete data files of the Storage and Meta Services","text":"If you have modified the data_path
in the configuration files for the Meta Service and Storage Service, the directories where NebulaGraph stores data may not be in the installation path of NebulaGraph. Check the configuration files to confirm the data paths, and then manually delete the directories to clear all data.
Note
For a NebulaGraph cluster, delete the data files of all Storage and Meta servers.
Check the Storage Service disk settings. For example:
########## Disk ##########\n# Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/\n# One path per Rocksdb instance.\n--data_path=/nebula/data/storage\n
Check the Metad Service configurations and find the corresponding metadata directories.
Delete the data and the directories found in step 2.
Note
Delete all installation directories, including the cluster.id
file in them.
The default installation path is /usr/local/nebula
, which is specified by --prefix
while installing NebulaGraph.
Find the installation directories of NebulaGraph, and delete them all.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/#uninstall_nebulagraph_deployed_with_rpm_packages","title":"Uninstall NebulaGraph deployed with RPM packages","text":"Run the following command to get the NebulaGraph version.
$ rpm -qa | grep \"nebula\"\n
The return message is as follows.
nebula-graph-master-1.x86_64\n
Run the following command to uninstall NebulaGraph.
sudo rpm -e <nebula_version>\n
For example:
sudo rpm -e nebula-graph-master-1.x86_64\n
Delete the installation directories.
Run the following command to get the NebulaGraph version.
$ dpkg -l | grep \"nebula\"\n
The return message is as follows.
ii nebula-graph master amd64 NebulaGraph Package built using CMake\n
Run the following command to uninstall NebulaGraph.
sudo dpkg -r <nebula_version>\n
For example:
sudo dpkg -r nebula-graph\n
Delete the installation directories.
In the nebula-docker-compose
directory, run the following command to stop the NebulaGraph services.
docker-compose down -v\n
Delete the nebula-docker-compose
directory.
This topic provides basic instruction on how to use the native CLI client NebulaGraph Console to connect to NebulaGraph.
Caution
When connecting to NebulaGraph for the first time, you must register the Storage Service before querying data.
NebulaGraph supports multiple types of clients, including a CLI client, a GUI client, and clients developed in popular programming languages. For more information, see the client list.
"},{"location":"4.deployment-and-installation/connect-to-nebula-graph/#prerequisites","title":"Prerequisites","text":"The NebulaGraph Console version is compatible with the NebulaGraph version.
Note
NebulaGraph Console and NebulaGraph of the same version number are the most compatible. There may be compatibility issues when connecting to NebulaGraph with a different version of NebulaGraph Console. The error message incompatible version between client and server
is displayed when there is such an issue.
On the NebulaGraph Console releases page, select a NebulaGraph Console version and click Assets.
Note
It is recommended to select the latest version.
In the Assets area, find the correct binary file for the machine where you want to run NebulaGraph Console and download the file to the machine.
(Optional) Rename the binary file to nebula-console
for convenience.
Note
For Windows, rename the file to nebula-console.exe
.
On the machine to run NebulaGraph Console, grant the execute permission of the nebula-console binary file to the user.
Note
For Windows, skip this step.
$ chmod 111 nebula-console\n
In the command line interface, change the working directory to the one where the nebula-console binary file is stored.
Run the following command to connect to NebulaGraph.
$ ./nebula-console -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
> nebula-console.exe -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP (or hostname) of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is millisecond. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. For information on more parameters, see the project repository.
NebulaGraph supports managing services with scripts.
"},{"location":"4.deployment-and-installation/manage-service/#manage_services_with_script","title":"Manage services with script","text":"You can use the nebula.service
script to start, stop, restart, terminate, and check the NebulaGraph services.
Note
nebula.service
is stored in the /usr/local/nebula/scripts
directory by default. If you have customized the path, use the actual path in your environment.
$ sudo /usr/local/nebula/scripts/nebula.service\n[-v] [-c <config_file_path>]\n<start | stop | restart | kill | status>\n<metad | graphd | storaged | all>\n
Parameter Description -v
Display detailed debugging information. -c
Specify the configuration file path. The default path is /usr/local/nebula/etc/
. start
Start the target services. stop
Stop the target services. restart
Restart the target services. kill
Terminate the target services. status
Check the status of the target services. metad
Set the Meta Service as the target service. graphd
Set the Graph Service as the target service. storaged
Set the Storage Service as the target service. all
Set all the NebulaGraph services as the target services."},{"location":"4.deployment-and-installation/manage-service/#start_nebulagraph","title":"Start NebulaGraph","text":"Run the following command to start NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service start all\n[INFO] Starting nebula-metad...\n[INFO] Done\n[INFO] Starting nebula-graphd...\n[INFO] Done\n[INFO] Starting nebula-storaged...\n[INFO] Done\n
"},{"location":"4.deployment-and-installation/manage-service/#stop_nebulagraph","title":"Stop NebulaGraph","text":"Danger
Do not run kill -9
to forcibly terminate the processes. Otherwise, there is a low probability of data loss.
Run the following command to stop NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service stop all\n[INFO] Stopping nebula-metad...\n[INFO] Done\n[INFO] Stopping nebula-graphd...\n[INFO] Done\n[INFO] Stopping nebula-storaged...\n[INFO] Done\n
"},{"location":"4.deployment-and-installation/manage-service/#check_the_service_status","title":"Check the service status","text":"Run the following command to check the service status of NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service status all\n
NebulaGraph is running normally if the following information is returned.
INFO] nebula-metad(33fd35e): Running as 29020, Listening on 9559\n[INFO] nebula-graphd(33fd35e): Running as 29095, Listening on 9669\n[WARN] nebula-storaged after v3.0.0 will not start service until it is added to cluster.\n[WARN] See Manage Storage hosts:ADD HOSTS in https://docs.nebula-graph.io/\n[INFO] nebula-storaged(33fd35e): Running as 29147, Listening on 9779\n
Note
After starting NebulaGraph, the port of the nebula-storaged
process is shown in red. Because the nebula-storaged
process waits for the nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
[INFO] nebula-metad: Running as 25600, Listening on 9559\n[INFO] nebula-graphd: Exited\n[INFO] nebula-storaged: Running as 25646, Listening on 9779\n
The NebulaGraph services consist of the Meta Service, Graph Service, and Storage Service. The configuration files for all three services are stored in the /usr/local/nebula/etc/
directory by default. You can check the configuration files according to the returned result to troubleshoot problems.
Connect to NebulaGraph
"},{"location":"4.deployment-and-installation/manage-storage-host/","title":"Manage Storage hosts","text":"Starting from NebulaGraph 3.0.0, setting Storage hosts in the configuration files only registers the hosts on the Meta side, but does not add them into the cluster. You must run the ADD HOSTS
statement to add the Storage hosts.
Note
NebulaGraph Cloud clusters add Storage hosts automatically. Cloud users do not need to manually run ADD HOSTS
.
Add the Storage hosts to a NebulaGraph cluster.
nebula> ADD HOSTS <ip>:<port> [,<ip>:<port> ...];\nnebula> ADD HOSTS \"<hostname>\":<port> [,\"<hostname>\":<port> ...];\n
Note
SHOW HOSTS
to check whether the host is online.127.0.0.1:9779
.ADD HOSTS \"foo-bar\":9779
.Delete the Storage hosts from cluster.
Note
You can not delete an in-use Storage host directly. Delete the associated graph space before deleting the Storage host.
nebula> DROP HOSTS <ip>:<port> [,<ip>:<port> ...];\nnebula> DROP HOSTS \"<hostname>\":<port> [,\"<hostname>\":<port> ...];\n
"},{"location":"4.deployment-and-installation/manage-storage-host/#view_storage_hosts","title":"View Storage hosts","text":"View the Storage hosts in the cluster.
nebula> SHOW HOSTS STORAGE;\n+-------------+------+----------+-----------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-------------+------+----------+-----------+--------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n+-------------+------+----------+-----------+--------------+---------+\n
"},{"location":"4.deployment-and-installation/standalone-deployment/","title":"Standalone NebulaGraph","text":"Standalone NebulaGraph merges the Meta, Storage, and Graph services into a single process deployed on a single machine. This topic introduces scenarios, deployment steps, etc. of standalone NebulaGraph.
Danger
Do not use standalone NebulaGraph in production environments.
"},{"location":"4.deployment-and-installation/standalone-deployment/#background","title":"Background","text":"The traditional NebulaGraph consists of three services, each service having executable binary files and the corresponding process. Processes communicate with each other by RPC. In standalone NebulaGraph, the three processes corresponding to the three services are combined into one process. For more information about NebulaGraph, see Architecture overview.
"},{"location":"4.deployment-and-installation/standalone-deployment/#scenarios","title":"Scenarios","text":"Small data sizes and low availability requirements. For example, test environments that are limited by the number of machines, scenarios that are only used to verify functionality.
"},{"location":"4.deployment-and-installation/standalone-deployment/#limitations","title":"Limitations","text":"For information about the resource requirements for standalone NebulaGraph, see Software requirements for compiling NebulaGraph.
"},{"location":"4.deployment-and-installation/standalone-deployment/#steps","title":"Steps","text":"Currently, you can only install standalone NebulaGraph with the source code. The steps are similar to those of the multi-process NebulaGraph. You only need to modify the step Generate Makefile with CMake by adding -DENABLE_STANDALONE_VERSION=on
to the command. For example:
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/nebula -DENABLE_TESTING=OFF -DENABLE_STANDALONE_VERSION=on -DCMAKE_BUILD_TYPE=Release .. \n
For more information about installation details, see Install NebulaGraph by compiling the source code.
After installing standalone NebulaGraph, see the topic connect to Service to connect to NebulaGraph databases.
"},{"location":"4.deployment-and-installation/standalone-deployment/#configuration_file","title":"Configuration file","text":"The path to the configuration file for standalone NebulaGraph is /usr/local/nebula/etc
by default.
You can run sudo cat nebula-standalone.conf.default
to see the file content. The parameters and the corresponding descriptions in the file are generally the same as the configurations for multi-process NebulaGraph except for the following parameters.
| Parameter | Default value | Description |
| --- | --- | --- |
| meta_port | 9559 | The port number of the Meta service. |
| storage_port | 9779 | The port number of the Storage service. |
| meta_data_path | data/meta | The path to the Meta data. |
You can run commands to check configurable parameters and the corresponding descriptions. For details, see Configurations.
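As an illustrative sketch only, overriding these defaults in nebula-standalone.conf would use the same gflags flag format as the other NebulaGraph configuration files (the values below simply restate the defaults):
--meta_port=9559\n--storage_port=9779\n--meta_data_path=data/meta\n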
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/","title":"Install NebulaGraph by compiling the source code","text":"Installing NebulaGraph from the source code allows you to customize the compiling and installation settings and test the latest features.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#prerequisites","title":"Prerequisites","text":"Users have to prepare correct resources described in Prepare resources for compiling, installing, and running NebulaGraph.
Note
Compilation of NebulaGraph offline is not currently supported.
Use Git to clone the source code of NebulaGraph to the host.
[Recommended] To install the stable release, run the following command to clone the release-3.6 branch.
$ git clone --branch release-3.6 https://github.com/vesoft-inc/nebula.git\n
To install the latest developing release, run the following command to clone the source code from the master branch.
$ git clone https://github.com/vesoft-inc/nebula.git\n
Go to the nebula/third-party
directory, and run the install-third-party.sh
script to install the third-party libraries.
$ cd nebula/third-party\n$ ./install-third-party.sh\n
Go back to the nebula
directory, create a directory named build
, and enter the directory.
$ cd ..\n$ mkdir build && cd build\n
Generate Makefile with CMake.
Note
The installation path is /usr/local/nebula
by default. To customize it, add the -DCMAKE_INSTALL_PREFIX=<installation_path>
CMake variable in the following command.
For more information about CMake variables, see CMake variables.
$ cmake -DCMAKE_INSTALL_PREFIX=/usr/local/nebula -DENABLE_TESTING=OFF -DCMAKE_BUILD_TYPE=Release ..\n
Compile NebulaGraph.
Note
Check Prepare resources for compiling, installing, and running NebulaGraph.
To speed up the compiling, use the -j
option to set a concurrent number N
. It should be \\(\\min(\\text{CPU core number},\\frac{\\text{the memory size(GB)}}{2})\\).
$ make -j{N} # E.g., make -j2\n
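For example, on a machine with 8 CPU cores and 8 GB of memory, N is min(8, 8/2) = 4. The following sketch derives N automatically on Linux, assuming the nproc and free utilities are available:
$ CORES=$(nproc)\n$ MEM_HALF=$(free -g | awk '/^Mem:/{print int($2/2)}')\n$ N=$(( CORES < MEM_HALF ? CORES : MEM_HALF ))  # concurrency = min(cores, memory_GB/2)\n$ make -j${N}\n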
Install NebulaGraph.
$ sudo make install\n
Note
The configuration files in the etc/
directory (/usr/local/nebula/etc
by default) are references. Users can create their own configuration files accordingly. If you want to use the scripts in the script
directory to start, stop, restart, and kill the services, and to check the service status, the configuration files have to be named nebula-graphd.conf
, nebula-metad.conf
, and nebula-storaged.conf
.
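For example, assuming the default installation path, the reference files can be copied into place as follows:
$ cd /usr/local/nebula/etc\n$ cp nebula-graphd.conf.default nebula-graphd.conf\n$ cp nebula-metad.conf.default nebula-metad.conf\n$ cp nebula-storaged.conf.default nebula-storaged.conf\n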
The source code of the master branch changes frequently. If the corresponding NebulaGraph release is installed, update it in the following steps.
In the nebula
directory, run git pull upstream master
to update the source code.
In the nebula/build
directory, run make -j{N}
and make install
again.
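Put together, an update might look like the following sketch, assuming the remote is named upstream and two build threads are used:
$ cd nebula\n$ git pull upstream master\n$ cd build\n$ make -j2\n$ sudo make install\n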
Manage NebulaGraph services
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#cmake_variables","title":"CMake variables","text":""},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#usage_of_cmake_variables","title":"Usage of CMake variables","text":"$ cmake -D<variable>=<value> ...\n
The following CMake variables can be used at the configure (cmake) stage to adjust the compiling settings.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#cmake_install_prefix","title":"CMAKE_INSTALL_PREFIX","text":"CMAKE_INSTALL_PREFIX
specifies the path where the service modules, scripts, configuration files are installed. The default path is /usr/local/nebula
.
ENABLE_WERROR
is ON
by default and it makes all warnings into errors. You can set it to OFF
if needed.
ENABLE_TESTING
is ON
by default and unit tests are built with the NebulaGraph services. If you just need the service modules, set it to OFF
.
ENABLE_ASAN
is OFF
by default and the building of ASan (AddressSanitizer), a memory error detector, is disabled. To enable it, set ENABLE_ASAN
to ON
. This variable is intended for NebulaGraph developers.
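These variables can be combined in a single cmake invocation; the values chosen below are only illustrative:
$ cmake -DENABLE_WERROR=OFF -DENABLE_TESTING=OFF -DENABLE_ASAN=ON ..\n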
NebulaGraph supports the following building types of CMAKE_BUILD_TYPE
:
Debug
The default value of CMAKE_BUILD_TYPE
. It indicates building NebulaGraph with the debug info but not the optimization options.
Release
It indicates building NebulaGraph with the optimization options but not the debug info.
RelWithDebInfo
It indicates building NebulaGraph with the optimization options and the debug info.
MinSizeRel
It indicates building NebulaGraph with the optimization options for controlling the code size but not the debug info.
ENABLE_INCLUDE_WHAT_YOU_USE
is OFF
by default. When set to ON
and include-what-you-use is installed on the system, the system reports redundant headers contained in the project source code during makefile generation.
Specifies the program linker on the system. The available values are:
bfd
, the default value, indicates that ld.bfd is applied as the linker.lld
, indicates that ld.lld, if installed on the system, is applied as the linker.gold
, indicates that ld.gold, if installed on the system, is applied as the linker.Usually, CMake locates and uses a C/C++ compiler installed in the host automatically. But if your compiler is not installed at the standard path, or if you want to use a different one, run the command as follows to specify the installation path of the target compiler:
$ cmake -DCMAKE_C_COMPILER=<path_to_gcc/bin/gcc> -DCMAKE_CXX_COMPILER=<path_to_gcc/bin/g++> ..\n$ cmake -DCMAKE_C_COMPILER=<path_to_clang/bin/clang> -DCMAKE_CXX_COMPILER=<path_to_clang/bin/clang++> ..\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#enable_ccache","title":"ENABLE_CCACHE","text":"ENABLE_CCACHE
is ON
by default and Ccache (compiler cache) is used to speed up the compiling of NebulaGraph.
To disable ccache
, setting ENABLE_CCACHE
to OFF
is not enough. On some platforms, the ccache
installation hooks up or precedes the compiler. In such a case, you have to set an environment variable export CCACHE_DISABLE=true
or add a line disable=true
in ~/.ccache/ccache.conf
as well. For more information, see the ccache official documentation.
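In practice, the two workarounds mentioned above look like this:
$ export CCACHE_DISABLE=true\n$ # or, equivalently:\n$ echo 'disable=true' >> ~/.ccache/ccache.conf\n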
NEBULA_THIRDPARTY_ROOT
specifies the path where the third party software is installed. By default it is /opt/vesoft/third-party
.
If the compiling fails, we suggest you:
Check whether the operating system release meets the requirements and whether the memory and hard disk space are sufficient.
Check whether the third-party is installed correctly.
Use make -j1
to reduce the compiling concurrency.
RPM and DEB are common package formats on Linux systems. This topic shows how to quickly install NebulaGraph with the RPM or DEB package.
Note
The console is not compiled or packaged with NebulaGraph server binaries. You can install nebula-console by yourself.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/2.install-nebula-graph-by-rpm-or-deb/#prerequisites","title":"Prerequisites","text":"wget
is installed.Note
NebulaGraph is currently only supported for installation on Linux systems, and only CentOS 7.x, CentOS 8.x, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 operating systems are supported.
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.deb\n
For example, download the release package master
for Centos 7.5
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm.sha256sum.txt\n
Download the release package master
for Ubuntu 1804
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb.sha256sum.txt\n
Download the nightly version.
Danger
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu2004.amd64.deb\n
For example, download the Centos 7.5
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm.sha256sum.txt\n
For example, download the Ubuntu 1804
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb.sha256sum.txt\n
Use the following syntax to install with an RPM package.
$ sudo rpm -ivh --prefix=<installation_path> <package_name>\n
The option --prefix
indicates the installation path. The default path is /usr/local/nebula/
.
For example, to install an RPM package in the default path for the master version, run the following command.
sudo rpm -ivh nebula-graph-master.el7.x86_64.rpm\n
Use the following syntax to install with a DEB package.
$ sudo dpkg -i <package_name>\n
Note
Customizing the installation path is not supported when installing NebulaGraph with a DEB package. The default installation path is /usr/local/nebula/
.
For example, to install a DEB package for the master version, run the following command.
sudo dpkg -i nebula-graph-master.ubuntu1804.amd64.deb\n
Note
The default installation path is /usr/local/nebula/
.
Using Docker Compose can quickly deploy NebulaGraph services based on the prepared configuration file. It is only recommended to use this method when testing functions of NebulaGraph.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#prerequisites","title":"Prerequisites","text":"You have installed the following applications on your host.
| Application | Recommended version | Official installation reference |
| --- | --- | --- |
| Docker | Latest | Install Docker Engine |
| Docker Compose | Latest | Install Docker Compose |
| Git | Latest | Download Git |
directory.Clone the 3.6.0
branch of the nebula-docker-compose
repository to your host with Git.
Danger
The master
branch contains the untested code for the latest NebulaGraph development release. DO NOT use this release in a production environment.
$ git clone -b release-3.6 https://github.com/vesoft-inc/nebula-docker-compose.git\n
Note
The x.y
version of Docker Compose aligns to the x.y
version of NebulaGraph. For the NebulaGraph z
version, Docker Compose does not publish the corresponding z
version, but pulls the z
version of the NebulaGraph image.
Go to the nebula-docker-compose
directory.
$ cd nebula-docker-compose/\n
Run the following command to start all the NebulaGraph services.
Note
[nebula-docker-compose]$ docker-compose up -d\nCreating nebula-docker-compose_metad0_1 ... done\nCreating nebula-docker-compose_metad2_1 ... done\nCreating nebula-docker-compose_metad1_1 ... done\nCreating nebula-docker-compose_graphd2_1 ... done\nCreating nebula-docker-compose_graphd_1 ... done\nCreating nebula-docker-compose_graphd1_1 ... done\nCreating nebula-docker-compose_storaged0_1 ... done\nCreating nebula-docker-compose_storaged2_1 ... done\nCreating nebula-docker-compose_storaged1_1 ... done\n
Compatibility
Starting from NebulaGraph version 3.1.0, nebula-docker-compose automatically starts a NebulaGraph Console docker container and adds the storage host to the cluster (i.e. ADD HOSTS
command).
Note
For more information of the preceding services, see NebulaGraph architecture.
There are two ways to connect to NebulaGraph:
9669
in the container's configuration file, you can connect directly through the default port. For details, see Connect to NebulaGraph.Run the following command to view the name of NebulaGraph Console docker container.
$ docker-compose ps\n Name Command State Ports \n-----------------------------------------------------------------------------------------------------------------------------------------------------------\nnebula-docker-compose_console_1 sh -c for i in `seq 1 60`; ... Up \nnebula-docker-compose_graphd1_1 /usr/local/nebula/bin/nebu ... Up (healthy) 0.0.0.0:32847->15669/tcp,:::32847->15669/tcp, 19669/tcp, \n 0.0.0.0:32846->19670/tcp,:::32846->19670/tcp, \n 0.0.0.0:32849->5669/tcp,:::32849->5669/tcp, 9669/tcp \n......\n
Note
nebula-docker-compose_console_1
and nebula-docker-compose_graphd1_1
are the container names of NebulaGraph Console and Graph Service respectively.
Run the following command to enter the NebulaGraph Console docker container.
docker exec -it nebula-docker-compose_console_1 /bin/sh\n/ #\n
Connect to NebulaGraph with NebulaGraph Console.
/ # ./usr/local/bin/nebula-console -u <user_name> -p <password> --address=graphd --port=9669\n
Note
By default, the authentication is off, you can only log in with an existing username (the default is root
) and any password. To turn it on, see Enable authentication.
Run the following commands to view the cluster state.
nebula> SHOW HOSTS;\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n
Run exit
twice to switch back to your terminal (shell).
Run docker-compose ps
to list all the services of NebulaGraph and their status and ports.
Note
NebulaGraph provides services to the clients through port 9669
by default. To use other ports, modify the docker-compose.yaml
file in the nebula-docker-compose
directory and restart the NebulaGraph services.
$ docker-compose ps\nnebula-docker-compose_console_1 sh -c sleep 3 && Up\n nebula-co ...\nnebula-docker-compose_graphd1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49174->19669/tcp,:::49174->19669/tcp, 0.0.0.0:49171->19670/tcp,:::49171->19670/tcp, 0.0.0.0:49177->9669/tcp,:::49177->9669/tcp\nnebula-docker-compose_graphd2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49175->19669/tcp,:::49175->19669/tcp, 0.0.0.0:49172->19670/tcp,:::49172->19670/tcp, 0.0.0.0:49178->9669/tcp,:::49178->9669/tcp\nnebula-docker-compose_graphd_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49180->19669/tcp,:::49180->19669/tcp, 0.0.0.0:49179->19670/tcp,:::49179->19670/tcp, 0.0.0.0:9669->9669/tcp,:::9669->9669/tcp\nnebula-docker-compose_metad0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49157->19559/tcp,:::49157->19559/tcp, 0.0.0.0:49154->19560/tcp,:::49154->19560/tcp, 0.0.0.0:49160->9559/tcp,:::49160->9559/tcp, 9560/tcp\nnebula-docker-compose_metad1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49156->19559/tcp,:::49156->19559/tcp, 0.0.0.0:49153->19560/tcp,:::49153->19560/tcp, 0.0.0.0:49159->9559/tcp,:::49159->9559/tcp, 9560/tcp\nnebula-docker-compose_metad2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49158->19559/tcp,:::49158->19559/tcp, 0.0.0.0:49155->19560/tcp,:::49155->19560/tcp, 0.0.0.0:49161->9559/tcp,:::49161->9559/tcp, 9560/tcp\nnebula-docker-compose_storaged0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49166->19779/tcp,:::49166->19779/tcp, 0.0.0.0:49163->19780/tcp,:::49163->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49169->9779/tcp,:::49169->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49165->19779/tcp,:::49165->19779/tcp, 0.0.0.0:49162->19780/tcp,:::49162->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49168->9779/tcp,:::49168->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49167->19779/tcp,:::49167->19779/tcp, 0.0.0.0:49164->19780/tcp,:::49164->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49170->9779/tcp,:::49170->9779/tcp, 9780/tcp\n
If the service is abnormal, you can first confirm the abnormal container name (such as nebula-docker-compose_graphd2_1
) and then log in to the container and troubleshoot.
$ docker exec -it nebula-docker-compose_graphd2_1 bash\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#check_the_service_data_and_logs","title":"Check the service data and logs","text":"All the data and logs of NebulaGraph are stored persistently in the nebula-docker-compose/data
and nebula-docker-compose/logs
directories.
The structure of the directories is as follows:
nebula-docker-compose/\n |-- docker-compose.yaml\n \u251c\u2500\u2500 data\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 meta0\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 meta1\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 meta2\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 storage0\n \u2502\u00a0\u00a0 \u251c\u2500\u2500 storage1\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 storage2\n \u2514\u2500\u2500 logs\n \u251c\u2500\u2500 graph\n \u251c\u2500\u2500 graph1\n \u251c\u2500\u2500 graph2\n \u251c\u2500\u2500 meta0\n \u251c\u2500\u2500 meta1\n \u251c\u2500\u2500 meta2\n \u251c\u2500\u2500 storage0\n \u251c\u2500\u2500 storage1\n \u2514\u2500\u2500 storage2\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#modify_configurations","title":"Modify configurations","text":"The configuration file of Docker Compose is nebula-docker-compose/docker-compose.yaml
. To make the new configuration take effect, modify the configuration in this file and restart the service.
The configurations in the docker-compose.yaml
file overwrite the configurations in the configuration file (/usr/local/nebula/etc
) of the containered NebulaGraph service. Therefore, you can modify the configurations in the docker-compose.yaml
file to customize the configurations of the NebulaGraph service.
For more instructions, see Configurations.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#restart_nebulagraph_services","title":"Restart NebulaGraph services","text":"To restart all the NebulaGraph services, run the following command:
$ docker-compose restart\nRestarting nebula-docker-compose_console_1 ... done\nRestarting nebula-docker-compose_graphd_1 ... done\nRestarting nebula-docker-compose_graphd1_1 ... done\nRestarting nebula-docker-compose_graphd2_1 ... done\nRestarting nebula-docker-compose_storaged1_1 ... done\nRestarting nebula-docker-compose-storaged0_1 ... done\nRestarting nebula-docker-compose_storaged2_1 ... done\nRestarting nebula-docker-compose_metad1_1 ... done\nRestarting nebula-docker-compose_metad2_1 ... done\nRestarting nebula-docker-compose_metad0_1 ... done\n
To restart multiple services, such as graphd and storaged0, run the following command:
$ docker-compose restart graphd storaged0\nRestarting nebula-docker-compose_graphd_1 ... done\nRestarting nebula-docker-compose_storaged0_1 ... done\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#stop_and_remove_nebulagraph_services","title":"Stop and remove NebulaGraph services","text":"You can stop and remove all the NebulaGraph services by running the following command:
Danger
This command stops and removes all the containers of the NebulaGraph services and the related network. If you define volumes in the docker-compose.yaml
, the related data are retained.
The command docker-compose down -v
removes all the local data. Try this command if you are using the nightly release and having some compatibility issues.
$ docker-compose down\n
The following information indicates you have successfully stopped the NebulaGraph services:
Stopping nebula-docker-compose_console_1 ... done\nStopping nebula-docker-compose_graphd1_1 ... done\nStopping nebula-docker-compose_graphd_1 ... done\nStopping nebula-docker-compose_graphd2_1 ... done\nStopping nebula-docker-compose_storaged1_1 ... done\nStopping nebula-docker-compose_storaged0_1 ... done\nStopping nebula-docker-compose_storaged2_1 ... done\nStopping nebula-docker-compose_metad2_1 ... done\nStopping nebula-docker-compose_metad0_1 ... done\nStopping nebula-docker-compose_metad1_1 ... done\nRemoving nebula-docker-compose_console_1 ... done\nRemoving nebula-docker-compose_graphd1_1 ... done\nRemoving nebula-docker-compose_graphd_1 ... done\nRemoving nebula-docker-compose_graphd2_1 ... done\nRemoving nebula-docker-compose_storaged1_1 ... done\nRemoving nebula-docker-compose_storaged0_1 ... done\nRemoving nebula-docker-compose_storaged2_1 ... done\nRemoving nebula-docker-compose_metad2_1 ... done\nRemoving nebula-docker-compose_metad0_1 ... done\nRemoving nebula-docker-compose_metad1_1 ... done\nRemoving network nebula-docker-compose_nebula-net\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#faq","title":"FAQ","text":""},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#how_to_fix_the_docker_mapping_to_external_ports","title":"How to fix the docker mapping to external ports?","text":"To set the ports
of corresponding services as fixed mapping, modify the docker-compose.yaml
in the nebula-docker-compose
directory. For example:
graphd:\n image: vesoft/nebula-graphd:release-3.6\n ...\n ports:\n - 9669:9669\n - 19669\n - 19670\n
9669:9669
indicates the internal port 9669 is uniformly mapped to external ports, while 19669
indicates the internal port 19669 is randomly mapped to external ports.
In the nebula-docker-compose/docker-compose.yaml
file, change all the image
values to the required image version.
In the nebula-docker-compose
directory, run docker-compose pull
to update the images of the Graph Service, Storage Service, Meta Service, and NebulaGraph Console.
Run docker-compose up -d
to start the NebulaGraph services again.
After connecting to NebulaGraph with NebulaGraph Console, run SHOW HOSTS GRAPH
, SHOW HOSTS STORAGE
, or SHOW HOSTS META
to check the version of the responding service respectively.
ERROR: toomanyrequests
when docker-compose pull
","text":"You may meet the following error.
ERROR: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
.
You have met the rate limit of Docker Hub. Learn more on Understanding Docker Hub Rate Limiting.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#how_to_update_the_nebulagraph_console_client","title":"How to update the NebulaGraph Console client?","text":"The command docker-compose pull
updates both the NebulaGraph services and the NebulaGraph Console.
offline
status?","text":"The activation script for storaged containers in Docker Compose may fail to run in rare cases. You can connect to NebulaGraph with NebulaGraph Console or NebulaGraph Studio and then manually run the ADD HOSTS
command to activate them by adding the storaged containers to the cluster. An example of the command is as follows:
nebula> ADD HOSTS \"storaged0\":9779,\"storaged1\":9779,\"storaged2\":9779\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#related_documents","title":"Related documents","text":"You can install NebulaGraph by downloading the tar.gz file.
Note
Download the NebulaGraph tar.gz file using the following address.
Before downloading, you need to replace <release_version>
with the version you want to download.
//Centos 7\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.tar.gz.sha256sum.txt\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.tar.gz.sha256sum.txt\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.tar.gz.sha256sum.txt\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.tar.gz.sha256sum.txt\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.tar.gz.sha256sum.txt\n
For example, to download the NebulaGraph release-3.6 tar.gz file for CentOS 7.5
, run the following command:
wget https://oss-cdn.nebula-graph.com.cn/package/master/nebula-graph-master.el7.x86_64.tar.gz\n
Decompress the tar.gz file to the NebulaGraph installation directory.
tar -xvzf <tar.gz_file_name> -C <install_path>\n
tar.gz_file_name
specifies the name of the tar.gz file.install_path
specifies the installation path.For example:
tar -xvzf nebula-graph-master.el7.x86_64.tar.gz -C /home/joe/nebula/install\n
Modify the name of the configuration file.
Enter the decompressed directory, rename the files nebula-graphd.conf.default
, nebula-metad.conf.default
, and nebula-storaged.conf.default
in the subdirectory etc
, and delete .default
to apply the default configuration of NebulaGraph.
Note
To modify the configuration, see Configurations.
So far, you have installed NebulaGraph successfully.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/4.install-nebula-graph-from-tar/#next_to_do","title":"Next to do","text":"Manage NebulaGraph services
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/6.deploy-nebula-graph-with-peripherals/","title":"Install NebulaGraph with ecosystem tools","text":"You can install the NebulaGraph Community Edition with the following ecosystem tools:
NebulaGraph's source code is written in C++. Compiling NebulaGraph requires certain dependencies which might conflict with host system dependencies, potentially causing compilation failures. Docker offers a solution to this. NebulaGraph provides a Docker image containing the complete compilation environment, ensuring an efficient build process and avoiding host OS conflicts. This guide outlines the steps to compile NebulaGraph using Docker.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/7.compile-using-docker/#prerequisites","title":"Prerequisites","text":"Before you begin:
Docker: Ensure Docker is installed on your system.
Clone NebulaGraph's Source Code: Clone the repository locally using:
git clone --branch release-3.6 https://github.com/vesoft-inc/nebula.git\n
This clones the NebulaGraph source code to a subdirectory named nebula
.
Pull the NebulaGraph compilation image.
docker pull vesoft/nebula-dev:ubuntu2004\n
Here, we use the official NebulaGraph compilation image, ubuntu2004
. For different versions, see nebula-dev-docker.
Start the compilation container.
docker run -ti \\\n --security-opt seccomp=unconfined \\\n -v \"$PWD\":/home \\\n -w /home \\\n --name nebula_dev \\\n vesoft/nebula-dev:ubuntu2004 \\\n bash\n
--security-opt seccomp=unconfined
: Disables the seccomp security mechanism to avoid compilation errors.-v \"$PWD\":/home
: Mounts the local path of the NebulaGraph code to the container's /home
directory.-w /home
: Sets the container's working directory to /home
. Any command run inside the container will use this directory as the current directory.--name nebula_dev
: Assigns a name to the container, making it easier to manage and operate.vesoft/nebula-dev:ubuntu2004
: Uses the ubuntu2004
version of the vesoft/nebula-dev
compilation image.bash
: Executes the bash
command inside the container, entering the container's interactive terminal.After executing this command, you'll enter an interactive terminal inside the container. To re-enter the container, use docker exec -ti nebula_dev bash
.
Compile NebulaGraph inside the container.
Enter the NebulaGraph source code directory.
cd nebula\n
Create a build directory and enter it.
mkdir build && cd build\n
Use CMake to generate the Makefile.
cmake -DCMAKE_CXX_COMPILER=$TOOLSET_CLANG_DIR/bin/g++ -DCMAKE_C_COMPILER=$TOOLSET_CLANG_DIR/bin/gcc -DENABLE_WERROR=OFF -DCMAKE_BUILD_TYPE=Debug -DENABLE_TESTING=OFF ..\n
For more on CMake, see CMake Parameters. Compile NebulaGraph.
# The -j parameter specifies the number of threads to use.\n# If you have a multi-core CPU, you can use more threads to speed up compilation.\nmake -j2\n
Compilation might take some time based on your system performance.
Install the Executables and Libraries.
Post successful compilation, NebulaGraph's binaries and libraries are located in /home/nebula/build
. Install them to /usr/local/nebula
:
make install\n
Once completed, NebulaGraph is compiled and installed in the host directory /usr/local/nebula
.
For now, NebulaGraph does not provide an official deployment tool. Users can deploy a NebulaGraph cluster with RPM or DEB package manually. This topic provides an example of deploying a NebulaGraph cluster on multiple servers (machines).
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/deploy-nebula-graph-cluster/#deployment","title":"Deployment","text":"Machine name IP address Number of graphd Number of storaged Number of metad A 192.168.10.111 1 1 1 B 192.168.10.112 1 1 1 C 192.168.10.113 1 1 1 D 192.168.10.114 1 1 None E 192.168.10.115 1 1 None"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/deploy-nebula-graph-cluster/#prerequisites","title":"Prerequisites","text":"Install NebulaGraph on each machine in the cluster. Available approaches of installation are as follows.
To deploy NebulaGraph according to your requirements, you have to modify the configuration files.
All the configuration files for NebulaGraph, including nebula-graphd.conf
, nebula-metad.conf
, and nebula-storaged.conf
, are stored in the etc
directory in the installation path. You only need to modify the configuration for the corresponding service on the machines. The configurations that need to be modified for each machine are as follows.
nebula-graphd.conf
, nebula-storaged.conf
, nebula-metad.conf
B nebula-graphd.conf
, nebula-storaged.conf
, nebula-metad.conf
C nebula-graphd.conf
, nebula-storaged.conf
, nebula-metad.conf
D nebula-graphd.conf
, nebula-storaged.conf
E nebula-graphd.conf
, nebula-storaged.conf
Users can refer to the content of the following configurations, which only show part of the cluster settings. The hidden content uses the default setting so that users can better understand the relationship between the servers in the NebulaGraph cluster.
Note
The main configuration to be modified is meta_server_addrs
. All configurations need to fill in the IP addresses and ports of all Meta services. At the same time, local_ip
needs to be modified as the network IP address of the machine itself. For detailed descriptions of the configuration parameters, see:
Deploy machine A
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.111\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.111\n# Storage daemon listening port\n--port=9779\n
nebula-metad.conf
########## networking ##########\n# Comma separated Meta Server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-metad process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.111\n# Meta daemon listening port\n--port=9559\n
Deploy machine B
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.112\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.112\n# Storage daemon listening port\n--port=9779\n
nebula-metad.conf
########## networking ##########\n# Comma separated Meta Server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-metad process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.112\n# Meta daemon listening port\n--port=9559\n
Deploy machine C
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.113\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.113\n# Storage daemon listening port\n--port=9779\n
nebula-metad.conf
########## networking ##########\n# Comma separated Meta Server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-metad process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.113\n# Meta daemon listening port\n--port=9559\n
Deploy machine D
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.114\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.114\n# Storage daemon listening port\n--port=9779\n
Deploy machine E
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.115\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.115\n# Storage daemon listening port\n--port=9779\n
Start the corresponding service on each machine. Descriptions are as follows.
Machine name The process to be started A graphd, storaged, metad B graphd, storaged, metad C graphd, storaged, metad D graphd, storaged E graphd, storagedThe command to start the NebulaGraph services is as follows.
sudo /usr/local/nebula/scripts/nebula.service start <metad|graphd|storaged|all>\n
Note
all
instead./usr/local/nebula
is the default installation path for NebulaGraph. Use the actual path if you have customized the path. For more information about how to start and stop the services, see Manage NebulaGraph services.Install the native CLI client NebulaGraph Console, then connect to any machine that has started the graphd process, run ADD HOSTS
command to add storage hosts, and run SHOW HOSTS
to check the cluster status. For example:
$ ./nebula-console --addr 192.168.10.111 --port 9669 -u root -p nebula\n\n2021/05/25 01:41:19 [INFO] connection pool is initialized successfully\nWelcome to NebulaGraph!\n\n> ADD HOSTS 192.168.10.111:9779, 192.168.10.112:9779, 192.168.10.113:9779, 192.168.10.114:9779, 192.168.10.115:9779;\n> SHOW HOSTS;\n+------------------+------+----------+--------------+----------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+------------------+------+----------+--------------+----------------------+------------------------+---------+\n| \"192.168.10.111\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.112\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.113\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.114\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.115\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n+------------------+------+-----------+----------+--------------+----------------------+------------------------+---------+\n
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/","title":"Upgrade NebulaGraph to master","text":"This topic describes how to upgrade NebulaGraph from version 2.x and 3.x to master, taking upgrading from version 2.6.1 to master as an example.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#applicable_source_versions","title":"Applicable source versions","text":"This topic applies to upgrading NebulaGraph from 2.5.0 and later 2.x, and 3.x versions to master. It does not apply to historical versions earlier than 2.5.0, including the 1.x versions.
To upgrade NebulaGraph from historical versions to master:
Caution
To upgrade NebulaGraph from versions earlier than 2.0.0 (including the 1.x versions) to master, you need to find the date_time_zonespec.csv
in the share/resources
directory of master files, and then copy it to the same directory in the NebulaGraph installation path.
Client compatibility
After the upgrade, you will not be able to connect to NebulaGraph from old clients. You will need to upgrade all clients to a version compatible with NebulaGraph master.
Configuration changes
A few configuration parameters have been changed. For more information, see the release notes and configuration docs.
nGQL compatibility
The nGQL syntax is partially incompatible:
YIELD
clause to return custom variables.YIELD
clause is required in the FETCH
, GO
, LOOKUP
, FIND PATH
and GET SUBGRAPH
statements.MATCH
statement. For example, from return v.name
to return v.player.name
.Full-text indexes
Before upgrading a NebulaGraph cluster with full-text indexes deployed, you must manually delete the full-text indexes in Elasticsearch, and then run the SIGN IN
command to log into ES and recreate the indexes after the upgrade is complete. To manually delete the full-text indexes in Elasticsearch, you can use the curl command curl -XDELETE -u <es_username>:<es_password> '<es_access_ip>:<port>/<fullindex_name>'
, for example, curl -XDELETE -u elastic:elastic 'http://192.168.8.xxx:9200/nebula_index_2534'
. If no username and password are set for Elasticsearch, you can omit the -u <es_username>:<es_password>
part.
Caution
There may be other undiscovered influences. Before the upgrade, we recommend that you read the release notes and user manual carefully, and keep an eye on the posts on the forum and issues on Github.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#preparations_before_the_upgrade","title":"Preparations before the upgrade","text":"Download the package of NebulaGraph master according to your operating system and system architecture. You need the binary files during the upgrade. Find the package on the download page.
Note
You can also get the new binaries from the source code or the RPM/DEB package.
Locate the data files based on the value of the data_path
parameters in the Storage and Meta configurations, and backup the data files. The default paths are nebula/data/storage
and nebula/data/meta
.
Danger
The old data will not be automatically backed up during the upgrade. You must manually back up the data to avoid data loss.
Collect the statistics of all graph spaces before the upgrade. After the upgrade, you can collect again and compare the results to make sure that no data is lost. To collect the statistics:
SUBMIT JOB STATS
.SHOW JOBS
and record the result.Stop all NebulaGraph services.
<nebula_install_path>/scripts/nebula.service stop all\n
nebula_install_path
indicates the installation path of NebulaGraph.
The storaged progress needs around 1 minute to flush data. You can run nebula.service status all
to check if all services are stopped. For more information about starting and stopping services, see Manage services.
Note
If the services are not fully stopped in 20 minutes, stop upgrading and ask for help on the forum or Github.
Caution
Starting from version 3.0.0, it is possible to insert vertices without tags. If you need to keep vertices without tags, add --graph_use_vertex_key=true
in the configuration file (nebula-graphd.conf
) of all Graph services within the cluster; and add --use_vertex_key=true
in the configuration file (nebula-storaged.conf
) of all Storage services.\"
In the target path where you unpacked the package, use the binaries in the bin
directory to replace the old binaries in the bin
directory in the NebulaGraph installation path.
Note
Update the binary of the corresponding service on each NebulaGraph server.
Modify the following parameters in all Graph configuration files to accommodate the value range of the new version. If the parameter values are within the specified range, skip this step.
session_idle_timeout_secs
. The recommended value is 28800.client_idle_timeout_secs
. The recommended value is 28800.The default values of these parameters in the 2.x versions are not within the range of the new version. If you do not change the default values, the upgrade will fail. For detailed parameter description, see Graph Service Configuration.
Start all Meta services.
<nebula_install_path>/scripts/nebula-metad.service start\n
Once started, the Meta services take several seconds to elect a leader.
To verify that Meta services are all started, you can start any Graph server, connect to it through NebulaGraph Console, and run SHOW HOSTS meta
and SHOW META LEADER
. If the status of Meta services are correctly returned, the services are successfully started.
Note
If the operation fails, stop the upgrade and ask for help on the forum or GitHub.
Start all the Graph and Storage services.
Note
If the operation fails, stop the upgrade and ask for help on the forum or GitHub.
Connect to the new version of NebulaGraph to verify that services are available and data are complete. For how to connect, see Connect to NebulaGraph.
Currently, there is no official way to check whether the upgrade is successful. You can run the following reference statements to test the upgrade:
nebula> SHOW HOSTS;\nnebula> SHOW HOSTS storage;\nnebula> SHOW SPACES;\nnebula> USE <space_name>\nnebula> SHOW PARTS;\nnebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\nnebula> MATCH (v) RETURN v LIMIT 5;\n
You can also test against new features in version master.
If the upgrade fails, stop all NebulaGraph services of the new version, recover the old configuration files and binaries, and start the services of the old version.
All NebulaGraph clients in use must be switched to the old version.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#faq","title":"FAQ","text":""},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#can_i_write_through_the_client_during_the_upgrade","title":"Can I write through the client during the upgrade?","text":"A: No. You must stop all NebulaGraph services during the upgrade.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#the_space_0_not_found_warning_message_during_the_upgrade_process","title":"TheSpace 0 not found
warning message during the upgrade process","text":"When the Space 0 not found
warning message appears during the upgrade process, you can ignore it. The space 0
is used to store meta information about the Storage service and does not contain user data, so it will not affect the upgrade.
A: You only need to update the configuration files and binaries of the Graph Service.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#how_to_resolve_the_error_permission_denied","title":"How to resolve the errorPermission denied
?","text":"A: Try again with the sudo privileges.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#is_there_any_change_in_gflags","title":"Is there any change in gflags?","text":"A: Yes. For more information, see the release notes and configuration docs.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#is_there_a_tool_or_solution_for_verifying_data_consistency_after_the_upgrade","title":"Is there a tool or solution for verifying data consistency after the upgrade?","text":"A: No. But if you only want to check the number of vertices and edges, run SUBMIT JOB STATS
and SHOW STATS
after the upgrade, and compare the result with the result that you recorded before the upgrade.
OFFLINE
and Leader count
is 0
?","text":"A: Run the following statement to add the Storage hosts into the cluster manually.
ADD HOSTS <ip>:<port>[, <ip>:<port> ...];\n
For example:
ADD HOSTS 192.168.10.100:9779, 192.168.10.101:9779, 192.168.10.102:9779;\n
If the issue persists, ask for help on the forum or GitHub.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#why_the_job_type_changed_after_the_upgrade_but_job_id_remains_the_same","title":"Why the job type changed after the upgrade, but job ID remains the same?","text":"A: SHOW JOBS
depends on an internal ID to identify job types, but in NebulaGraph 2.5.0 the internal ID changed in this pull request, so this issue happens after upgrading from a version earlier than 2.5.0.
This topic introduces the restrictions for full-text indexes. Please read the restrictions very carefully before using the full-text indexes.
Caution
The full-text index feature has been redone in version 3.6.0 and is not compatible with previous versions. If you want to continue to use wildcards, regulars, fuzzy matches, etc., there are 3 ways to do so as follows:
For now, full-text search has the following limitations:
LOOKUP
statements only.LIMIT
clause to return more records, up to 10,000. You can modify the ElasticSearch parameters to adjust the maximum number of records returned.STRING
or FIXED_STRING
.NULL
.Nebula\u00a0Graph full-text indexes are powered by Elasticsearch. This means that you can use Elasticsearch full-text query language to retrieve what you want. Full-text indexes are managed through built-in procedures. They can be created only for variable STRING
and FIXED_STRING
properties when the listener cluster and the Elasticsearch cluster are deployed.
Before you start using the full-text index, please make sure that you know the restrictions.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#deploy_elasticsearch_cluster","title":"Deploy Elasticsearch cluster","text":"To deploy an Elasticsearch cluster, see Kubernetes Elasticsearch deployment or Elasticsearch installation.
Note
To support external network access to Elasticsearch, set network.host
to 0.0.0.0
in config/elasticsearch.yml
.
You can configure the Elasticsearch to meet your business needs. To customize the Elasticsearch, see Elasticsearch Document.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#sign_in_to_the_text_search_clients","title":"Sign in to the text search clients","text":"When the Elasticsearch cluster is deployed, use the SIGN IN
statement to sign in to the Elasticsearch clients. Multiple elastic_ip:port
pairs are separated with commas. You must use the IPs and the port number in the configuration file for the Elasticsearch.
SIGN IN TEXT SERVICE (<elastic_ip:port>, {HTTP | HTTPS} [,\"<username>\", \"<password>\"]) [, (<elastic_ip:port>, ...)];\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#example","title":"Example","text":"nebula> SIGN IN TEXT SERVICE (192.168.8.100:9200, HTTP);\n
Note
Elasticsearch does not have a username or password by default. If you configured a username and password, you need to specify them in the SIGN IN
statement.
Caution
The Elasticsearch client can only be logged in once, and if there are changes, you need to SIGN OUT
and then SIGN IN
again, and the client takes effect globally, and multiple graph spaces share the same Elasticsearch client.
The SHOW TEXT SEARCH CLIENTS
statement can list the text search clients.
SHOW TEXT SEARCH CLIENTS;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#example_1","title":"Example","text":"nebula> SHOW TEXT SEARCH CLIENTS;\n+-----------------+-----------------+------+\n| Type | Host | Port |\n+-----------------+-----------------+------+\n| \"ELASTICSEARCH\" | \"192.168.8.100\" | 9200 |\n+-----------------+-----------------+------+\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#sign_out_to_the_text_search_clients","title":"Sign out to the text search clients","text":"The SIGN OUT TEXT SERVICE
statement can sign out all the text search clients.
SIGN OUT TEXT SERVICE;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#example_2","title":"Example","text":"nebula> SIGN OUT TEXT SERVICE;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/","title":"Deploy Raft Listener for NebulaGraph Storage service","text":"Full-text index data is written to the Elasticsearch cluster asynchronously. The Raft Listener (Listener for short) is a separate process that fetches data from the Storage Service and writes them into the Elasticsearch cluster.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#prerequisites","title":"Prerequisites","text":"The Listener service uses the same binary as the storaged service. However, the configuration files are different and the processes use different ports. You can install NebulaGraph on all servers that need to deploy a Listener, but only the storaged service can be used. For details, see Install NebulaGraph by RPM or DEB Package.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#step_2_prepare_the_configuration_file_for_the_listener","title":"Step 2: Prepare the configuration file for the Listener","text":"In the etc
directory, remove the suffix from nebula-storaged-listener.conf.default
or nebula-storaged-listener.conf.production
to nebula-storaged-listener.conf
, and then modify the configuration content.
Most configurations are the same as the configurations of Storage Service. This topic only introduces the differences.
Name Default value Descriptiondaemonize
true
When set to true
, the process is a daemon process. pid_file
pids/nebula-metad-listener.pid
The file that records the process ID. meta_server_addrs
- IP (or hostname) and ports of all Meta services. Multiple Meta services are separated by commas. local_ip
- The local IP (or hostname) of the Listener service. Use real IP addresses instead of domain names or loopback IP addresses such as 127.0.0.1
. port
- The listening port of the RPC daemon of the Listener service. heartbeat_interval_secs
10
The heartbeat interval of the Meta service. The unit is second (s). listener_path
data/listener
The WAL directory of the Listener. Only one directory is allowed. data_path
data
For compatibility reasons, this parameter can be ignored. Fill in the default value data
. part_man_type
memory
The type of the part manager. Optional values \u200b\u200bare memory
and meta
. rocksdb_batch_size
4096
The default reserved bytes for batch operations. rocksdb_block_cache
4
The default block cache size of BlockBasedTable. The unit is Megabyte (MB). engine_type
rocksdb
The type of the Storage engine, such as rocksdb
, memory
, etc. part_type
simple
The type of the part, such as simple
, consensus
, etc."},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#step_3_start_listeners","title":"Step 3: Start Listeners","text":"To initiate the Listener, navigate to the installation path of the desired cluster and execute the following command:
./bin/nebula-storaged --flagfile etc/nebula-storaged-listener.conf\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#step_4_add_listeners_to_nebulagraph","title":"Step 4: Add Listeners to NebulaGraph","text":"Connect to NebulaGraph and run USE <space>
to enter the graph space that you want to create full-text indexes for. Then run the following statement to add a Listener into NebulaGraph.
ADD LISTENER ELASTICSEARCH <listener_ip:port> [,<listener_ip:port>, ...]\n
Warning
You must use real IPs for a Listener.
Add all Listeners completely in one statement.
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.100:9789,192.168.8.101:9789;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#show_listeners","title":"Show Listeners","text":"Run the SHOW LISTENER
statement to list all Listeners.
nebula> SHOW LISTENER;\n+--------+-----------------+-----------------------+-------------+\n| PartId | Type | Host | Host Status |\n+--------+-----------------+-----------------------+-------------+\n| 1 | \"ELASTICSEARCH\" | \"192.168.8.100:9789\" | \"ONLINE\" |\n| 2 | \"ELASTICSEARCH\" | \"192.168.8.100:9789\" | \"ONLINE\" |\n| 3 | \"ELASTICSEARCH\" | \"192.168.8.100:9789\" | \"ONLINE\" |\n+--------+-----------------+-----------------------+-------------+\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#remove_listeners","title":"Remove Listeners","text":"Run the REMOVE LISTENER ELASTICSEARCH
statement to remove all Listeners in a graph space.
nebula> REMOVE LISTENER ELASTICSEARCH;\n
"},{"location":"5.configurations-and-logs/1.configurations/1.configurations/","title":"Configurations","text":"NebulaGraph builds the configurations based on the gflags repository. Most configurations are flags. When the NebulaGraph service starts, it will get the configuration information from Configuration files by default. Configurations that are not in the file apply the default values.
Note
Legacy version compatibility
In the topic of 1.x, we provide a method of using the CONFIGS
command to modify the configurations in the cache. However, using this method in a production environment can easily cause configuration inconsistencies between the cluster and the local configuration files. Therefore, this method is no longer introduced starting with version 2.x.
Use the following command to get all the configuration information of the service corresponding to the binary file:
<binary> --help\n
For example:
# Get the help information from Meta\n$ /usr/local/nebula/bin/nebula-metad --help\n\n# Get the help information from Graph\n$ /usr/local/nebula/bin/nebula-graphd --help\n\n# Get the help information from Storage\n$ /usr/local/nebula/bin/nebula-storaged --help\n
The above examples use the default storage path /usr/local/nebula/bin/
. If you modify the installation path of NebulaGraph, use the actual path to query the configurations.
Use the curl
command to get the value of the running configurations.
For example:
# Get the running configurations from Meta\ncurl 127.0.0.1:19559/flags\n\n# Get the running configurations from Graph\ncurl 127.0.0.1:19669/flags\n\n# Get the running configurations from Storage\ncurl 127.0.0.1:19779/flags\n
Use the -s or --silent option to hide the progress bar and error messages. For example:
curl -s 127.0.0.1:19559/flags\n
Note
In an actual environment, use the real IP (or hostname) instead of 127.0.0.1
in the above example.
NebulaGraph provides two initial configuration files for each service, <service_name>.conf.default
and <service_name>.conf.production
. You can use them in different scenarios conveniently. For clusters installed from source and with an RPM/DEB package, the default path is /usr/local/nebula/etc/
. For clusters installed with a TAR package, the path is <install_path>/<tar_package_directory>/etc
.
The configuration values in the initial configuration file are for reference only and can be adjusted according to actual needs. To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
to make it valid.
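For example, a minimal way to activate the default Graph configuration under the default installation path (a sketch; copy the file instead of renaming it if you want to keep the original):
cd /usr/local/nebula/etc\ncp nebula-graphd.conf.default nebula-graphd.conf\n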
Note
To ensure the availability of services, it is recommended that configurations for the same service be consistent, except for local_ip
. For example, three Storage servers are deployed in one NebulaGraph cluster. The configurations of the three Storage servers are recommended to be consistent, except for local_ip
.
The initial configuration files corresponding to each service are as follows.
NebulaGraph service Initial configuration file Description Metanebula-metad.conf.default
and nebula-metad.conf.production
Meta service configuration Graph nebula-graphd.conf.default
and nebula-graphd.conf.production
Graph service configuration Storage nebula-storaged.conf.default
and nebula-storaged.conf.production
Storage service configuration Each initial configuration file of all services contains local_config
. The default value is true
, which means that the NebulaGraph service gets configurations from its configuration files when it starts.
Caution
It is not recommended to modify the value of local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.
For clusters installed with Docker Compose, the configuration file's default installation path of the cluster is <install_path>/nebula-docker-compose/docker-compose.yaml
. The parameters in the command
field of the file are the launch parameters for each service.
For clusters installed with Kubectl through NebulaGraph Operator, the configuration file's path is the path of the cluster YAML file. You can modify the configuration of each service through the spec.{graphd|storaged|metad}.config
parameter.
Note
The services cannot be configured for clusters installed with Helm.
"},{"location":"5.configurations-and-logs/1.configurations/1.configurations/#modify_configurations","title":"Modify configurations","text":"You can modify the configurations of NebulaGraph in the configuration file or use commands to dynamically modify configurations.
Caution
Using both methods to modify the configuration can cause the configuration information to be managed inconsistently, which may result in confusion. It is recommended to only use the configuration file to manage the configuration, or to make the same modifications to the configuration file after dynamically updating the configuration through commands to ensure consistency.
"},{"location":"5.configurations-and-logs/1.configurations/1.configurations/#modifying_configurations_in_the_configuration_file","title":"Modifying configurations in the configuration file","text":"By default, each NebulaGraph service gets configured from its configuration files. You can modify configurations and make them valid according to the following steps:
For clusters installed from source, with an RPM/DEB, or a TAR package
Use a text editor to modify the configuration files of the target service and save the modification.
Choose an appropriate time to restart all NebulaGraph services to make the modifications valid.
For clusters installed with Docker Compose
In the <install_path>/nebula-docker-compose/docker-compose.yaml file, modify the configurations of the target service.
In the nebula-docker-compose directory, run the command docker-compose up -d to restart the services involving configuration modifications.
For clusters installed with Kubectl
For details, see Customize configuration parameters for a NebulaGraph cluster.
You can dynamically modify the configuration of NebulaGraph by using the curl command. For example, to modify the wal_ttl
parameter of the Storage service to 600
, use the following command:
curl -X PUT -H \"Content-Type: application/json\" -d'{\"wal_ttl\":\"600\"}' -s \"http://192.168.15.6:19779/flags\"\n
In this command, {\"wal_ttl\":\"600\"}
specifies the configuration parameter and its value to be modified, and 192.168.15.6:19779
specifies the IP address and HTTP port number of the Storage service.
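You can then read the flag back over the same HTTP interface to confirm that the change took effect, for example:
curl -s \"http://192.168.15.6:19779/flags\" | grep 'wal_ttl'\n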
Caution
local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted.NebulaGraph provides two initial configuration files for the Meta service, nebula-metad.conf.default
and nebula-metad.conf.production
. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/
.
Caution
local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
from the initial configuration file for the Meta Service to apply the configurations defined in it.
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default
.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
For all parameters and their current values, see Configurations.
"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#basics_configurations","title":"Basics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modificationsdaemonize
true
When set to true
, the process is a daemon process. No pid_file
pids/nebula-metad.pid
The file that records the process ID. No timezone_name
- Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. You can manually set it if you need it. The system default value is UTC+00:00:00
. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00
represents the GMT+8 time zone. No Note
timezone_name
. The time-type values returned by nGQL queries are all UTC time.timezone_name
is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.log_dir
logs
The directory that stores the Meta Service log. It is recommended to put logs on a different hard disk from the data. No minloglevel
0
Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs. Yes v
0
Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0
, 1
, 2
, 3
, 4
, 5
. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v
. For details, see Verbose Logging. Yes logbufsecs
0
Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0
means real-time output. This configuration is measured in seconds. No redirect_stdout
true
When set to true
, the process redirects thestdout
and stderr
to separate output files. No stdout_log_file
metad-stdout.log
Specifies the filename for the stdout
log. No stderr_log_file
metad-stderr.log
Specifies the filename for the stderr
log. No stderrthreshold
3
Specifies the minloglevel
to be copied to the stderr
log. No timestamp_in_logfile_name
true
Specifies if the log file name contains a timestamp. true
indicates yes, false
indicates no. No"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#networking_configurations","title":"Networking configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications meta_server_addrs
127.0.0.1:9559
Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No local_ip
127.0.0.1
Specifies the local IP (or hostname) for the Meta Service. The local IP address is used to identify the nebula-metad process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No port
9559
Specifies RPC daemon listening port of the Meta service. The neighboring +1
(9560
) port is used for Raft communication between Meta services. No ws_ip
0.0.0.0
Specifies the IP address for the HTTP service. No ws_http_port
19559
Specifies the port for the HTTP service. No ws_storage_http_port
19779
Specifies the Storage service listening port used by the HTTP protocol. It must be consistent with the ws_http_port
in the Storage service configuration file. This parameter only applies to standalone NebulaGraph. No Caution
It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0
cannot be parsed correctly in some cases.
data_path
data/meta
The storage path for Meta data. No"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#misc_configurations","title":"Misc configurations","text":"Name Predefined Value Description Whether supports runtime dynamic modifications default_parts_num
10
Specifies the default partition number when creating a new graph space. No default_replica_factor
1
Specifies the default replica number when creating a new graph space. No heartbeat_interval_secs
10
Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs
values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes agent_heartbeat_interval_secs
60
Specifies the default heartbeat interval for the Agent service. This configuration influences the time it takes for the system to determine that the Agent service is offline. This configuration is measured in seconds. No"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#rocksdb_options_configurations","title":"RocksDB options configurations","text":"Name Predefined Value Description Whether supports runtime dynamic modifications rocksdb_wal_sync
true
Enables or disables RocksDB WAL synchronization. Available values are true
(enable) and false
(disable). No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/","title":"Graph Service configuration","text":"NebulaGraph provides two initial configuration files for the Graph Service, nebula-graphd.conf.default
and nebula-graphd.conf.production
. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/
.
Caution
local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
from the initial configuration file for the Meta Service to apply the configurations defined in it.
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default
.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
For all parameters and their current values, see Configurations.
"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#basics_configurations","title":"Basics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modificationsdaemonize
true
When set to true
, the process is a daemon process. No pid_file
pids/nebula-graphd.pid
The file that records the process ID. No enable_optimizer
true
When set to true
, the optimizer is enabled. No timezone_name
- Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. The system default value is UTC+00:00:00
. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00
represents the GMT+8 time zone. No default_charset
utf8
Specifies the default charset when creating a new graph space. No default_collate
utf8_bin
Specifies the default collate when creating a new graph space. No local_config
true
When set to true
, the process gets configurations from the configuration files. No Note
timezone_name
. The time-type values returned by nGQL queries are all UTC time.timezone_name
is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.log_dir
logs
The directory that stores the Graph service log. It is recommended to put logs on a different hard disk from the data. No minloglevel
0
Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs. Yes v
0
Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0
, 1
, 2
, 3
, 4
, 5
. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v
. For details, see Verbose Logging. Yes logbufsecs
0
Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0
means real-time output. This configuration is measured in seconds. No redirect_stdout
true
When set to true
, the process redirects the stdout
and stderr
to separate output files. No stdout_log_file
graphd-stdout.log
Specifies the filename for the stdout
log. No stderr_log_file
graphd-stderr.log
Specifies the filename for the stderr
log. No stderrthreshold
3
Specifies the minloglevel
to be copied to the stderr
log. No timestamp_in_logfile_name
true
Specifies if the log file name contains a timestamp. true
indicates yes, false
indicates no. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#query_configurations","title":"Query configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications accept_partial_success
false
When set to false
, the process treats partial success as an error. This configuration only applies to read-only requests. Write requests always treat partial success as an error. A partial success query will prompt Got partial result
. Yes session_reclaim_interval_secs
60
Specifies the interval that the Session information is sent to the Meta service. This configuration is measured in seconds. Yes max_allowed_query_size
4194304
Specifies the maximum length of queries. Unit: bytes. The default value is 4194304
, namely 4MB. Yes"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#networking_configurations","title":"Networking configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications meta_server_addrs
127.0.0.1:9559
Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No local_ip
127.0.0.1
Specifies the local IP (or hostname) for the Graph Service. The local IP address is used to identify the nebula-graphd process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No listen_netdev
any
Specifies the listening network device. No port
9669
Specifies RPC daemon listening port of the Graph service. No reuse_port
false
When set to false
, the SO_REUSEPORT
is closed. No listen_backlog
1024
Specifies the maximum length of the connection queue for socket monitoring. This configuration must be modified together with the net.core.somaxconn
. No client_idle_timeout_secs
28800
Specifies the time to expire an idle connection. The value ranges from 1 to 604800. The default is 8 hours. This configuration is measured in seconds. No session_idle_timeout_secs
28800
Specifies the time to expire an idle session. The value ranges from 1 to 604800. The default is 8 hours. This configuration is measured in seconds. No num_accept_threads
1
Specifies the number of threads that accept incoming connections. No num_netio_threads
0
Specifies the number of networking IO threads. 0
is the number of CPU cores. No num_max_connections
0
Max active connections for all networking threads. 0 means no limit.Max connections for each networking thread = num_max_connections / num_netio_threads No num_worker_threads
0
Specifies the number of threads that execute queries. 0
is the number of CPU cores. No ws_ip
0.0.0.0
Specifies the IP address for the HTTP service. No ws_http_port
19669
Specifies the port for the HTTP service. No heartbeat_interval_secs
10
Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs
values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes storage_client_timeout_ms
- Specifies the RPC connection timeout threshold between the Graph Service and the Storage Service. This parameter is not predefined in the initial configuration files. You can manually set it if you need it. The system default value is 60000
ms. No slow_query_threshold_us
200000
When the execution time of a query exceeds the value, the query is called a slow query. Unit: Microsecond.Note: Even if the execution time of DML statements exceeds this value, they will not be recorded as slow queries. No ws_meta_http_port
19559
Specifies the Meta service listening port used by the HTTP protocol. It must be consistent with the ws_http_port
in the Meta service configuration file. No Caution
It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0
cannot be parsed correctly in some cases.
enable_authorize
false
When set to false
, the system authentication is not enabled. For more information, see Authentication. No auth_type
password
Specifies the login method. Available values are password
, ldap
, and cloud
. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#memory_configurations","title":"Memory configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications system_memory_high_watermark_ratio
0.8
Specifies the trigger threshold of the high-level memory alarm mechanism. If the system memory usage is higher than this value, an alarm mechanism will be triggered, and NebulaGraph will stop querying. This parameter is not predefined in the initial configuration files. Yes"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#metrics_configurations","title":"Metrics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications enable_space_level_metrics
false
Enable or disable space-level metrics. Such metric names contain the name of the graph space that it monitors, for example, query_latency_us{space=basketballplayer}.avg.3600
. You can view the supported metrics with the curl
command. For more information, see Query NebulaGraph metrics. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#session_configurations","title":"Session configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications max_sessions_per_ip_per_user
300
The maximum number of active sessions that can be created from a single IP address for a single user. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#experimental_configurations","title":"Experimental configurations","text":"Note
The switch of the experimental feature is only available in the Community Edition.
Name Predefined value Description Whether supports runtime dynamic modificationsenable_experimental_feature
false
Specifies the experimental feature. Optional values are true
and false
. No enable_data_balance
true
Whether to enable the BALANCE DATA feature. Only works when enable_experimental_feature
is true
. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#memory_tracker_configurations","title":"Memory tracker configurations","text":"Note
Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Graph service.
/sys/fs/cgroup/graphd/
, and then add and configure the memory.max
file under the directory.Add the following configurations to etc/nebula-graphd.conf
.
--containerized=true\n--cgroup_v2_controllers=/sys/fs/cgroup/graphd/cgroup.controllers\n--cgroup_v2_memory_stat_path=/sys/fs/cgroup/graphd/memory.stat\n--cgroup_v2_memory_max_path=/sys/fs/cgroup/graphd/memory.max\n--cgroup_v2_memory_current_path=/sys/fs/cgroup/graphd/memory.current\n
For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.
Name Predefined value Description Whether supports runtime dynamic modificationsmemory_tracker_limit_ratio
0.8
The value of this parameter can be set to (0, 1]
, 2
, and 3
.Caution: When setting this parameter, ensure that the value of system_memory_high_watermark_ratio
is not set to 1
, otherwise the value of this parameter will not take effect.(0, 1]
: The percentage of available memory. Formula: Percentage of available memory = Available memory / (Total memory - Reserved memory)
.When an ongoing query results in memory usage exceeding the configured limit, the query fails and subsequently the memory is released. Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, the value of memory_tracker_limit_ratio
should be set to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than 0.5
.2
: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory. Note: This feature is experimental. As memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur to handle large memory allocations. 3
: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded. Yes memory_tracker_untracked_reserved_memory_mb
50
The reserved memory that is not tracked by the memory tracker. Unit: MB. Yes memory_tracker_detail_log
false
Whether to enable the memory tracker log. When the value is true
, the memory tracker log is generated. Yes memory_tracker_detail_log_interval_ms
60000
The time interval for generating the memory tracker log. Unit: Millisecond. memory_tracker_detail_log
is true
when this parameter takes effect. Yes memory_purge_enabled
true
Whether to enable the memory purge feature. When the value is true
, the memory purge feature is enabled. Yes memory_purge_interval_seconds
10
The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if memory_purge_enabled
is set to true. Yes"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#performance_optimization_configurations","title":"performance optimization configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications max_job_size
1
The maximum number of concurrent jobs, i.e., the maximum number of threads used in the phase of query execution where concurrent execution is possible. It is recommended to be half of the physical CPU cores. Yes min_batch_size
8192
The minimum batch size for processing the dataset. Takes effect only when max_job_size
is greater than 1. Yes optimize_appendvertices
false
When enabled, the MATCH
statement is executed without filtering dangling edges. Yes path_batch_size
10000
The number of paths constructed per thread. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/","title":"Storage Service configurations","text":"NebulaGraph provides two initial configuration files for the Storage Service, nebula-storaged.conf.default
and nebula-storaged.conf.production
. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/
.
Caution
local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
from the initial configuration file for the Meta Service to apply the configurations defined in it.
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default
. For parameters that are not included in nebula-metad.conf.default
, see nebula-storaged.conf.production
.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
Note
The configurations of the Raft Listener and the Storage service are different. For details, see Deploy Raft listener.
For all parameters and their current values, see Configurations.
"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#basics_configurations","title":"Basics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modificationsdaemonize
true
When set to true
, the process is a daemon process. No pid_file
pids/nebula-storaged.pid
The file that records the process ID. No timezone_name
UTC+00:00:00
Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00
represents the GMT+8 time zone. No local_config
true
When set to true
, the process gets configurations from the configuration files. No Note
timezone_name
. The time-type values returned by nGQL queries are all UTC.timezone_name
is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.log_dir
logs
The directory that stores the Storage service log. It is recommended to put logs on a different hard disk from the data. No minloglevel
0
Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs. Yes v
0
Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0
, 1
, 2
, 3
, 4
, 5
. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v
. For details, see Verbose Logging. Yes logbufsecs
0
Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0
means real-time output. This configuration is measured in seconds. No redirect_stdout
true
When set to true
, the process redirects thestdout
and stderr
to separate output files. No stdout_log_file
graphd-stdout.log
Specifies the filename for the stdout
log. No stderr_log_file
graphd-stderr.log
Specifies the filename for the stderr
log. No stderrthreshold
3
Specifies the minloglevel
to be copied to the stderr
log. No timestamp_in_logfile_name
true
Specifies if the log file name contains a timestamp. true
indicates yes, false
indicates no. No"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#networking_configurations","title":"Networking configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications meta_server_addrs
127.0.0.1:9559
Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No local_ip
127.0.0.1
Specifies the local IP (or hostname) for the Storage Service. The local IP address is used to identify the nebula-storaged process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No port
9779
Specifies RPC daemon listening port of the Storage service. The neighboring ports -1
(9778
) and +1
(9780
) are also used. 9778
: The port used by the Admin service, which receives Meta commands for Storage. 9780
: The port used for Raft communication between Storage services. No ws_ip
0.0.0.0
Specifies the IP address for the HTTP service. No ws_http_port
19779
Specifies the port for the HTTP service. No heartbeat_interval_secs
10
Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs
values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes Caution
It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0
cannot be parsed correctly in some cases.
raft_heartbeat_interval_secs
30
Specifies the time to expire the Raft election. The configuration is measured in seconds. Yes raft_rpc_timeout_ms
500
Specifies the time to expire the Raft RPC. The configuration is measured in milliseconds. Yes wal_ttl
14400
Specifies the lifetime of the RAFT WAL. The configuration is measured in seconds. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#disk_configurations","title":"Disk configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications data_path
data/storage
Specifies the data storage path. Multiple paths are separated with commas. For NebulaGraph of the community edition, one RocksDB instance corresponds to one path. No minimum_reserved_bytes
268435456
Specifies the minimum remaining space of each data storage path. When the value is lower than this standard, the cluster data writing may fail. This configuration is measured in bytes. No rocksdb_batch_size
4096
Specifies the block cache for a batch operation. The configuration is measured in bytes. No rocksdb_block_cache
4
Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes. No disable_page_cache
false
Enables or disables the operating system's page cache for NebulaGraph. By default, the parameter value is false
and page cache is enabled. If the value is set to true
, page cache is disabled and sufficient block cache space must be configured for NebulaGraph. No engine_type
rocksdb
Specifies the engine type. No rocksdb_compression
lz4
Specifies the compression algorithm for RocksDB. Optional values are no
, snappy
, lz4
, lz4hc
, zlib
, bzip2
, and zstd
.This parameter modifies the compression algorithm for each level. If you want to set different compression algorithms for each level, use the parameter rocksdb_compression_per_level
. No rocksdb_compression_per_level
\\ Specifies the compression algorithm for each level. The priority is higher than rocksdb_compression
. For example, no:no:lz4:lz4:snappy:zstd:snappy
.You can also not set certain levels of compression algorithms, for example, no:no:lz4:lz4::zstd
, level L4 and L6 use the compression algorithm of rocksdb_compression
. No enable_rocksdb_statistics
false
When set to false
, RocksDB statistics is disabled. No rocksdb_stats_level
kExceptHistogramOrTimers
Specifies the stats level for RocksDB. Optional values are kExceptHistogramOrTimers
, kExceptTimers
, kExceptDetailedTimers
, kExceptTimeForMutex
, and kAll
. No enable_rocksdb_prefix_filtering
true
When set to true
, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter makes the graph traversal faster but occupies more memory. No enable_rocksdb_whole_key_filtering
false
When set to true
, the whole key bloom filter for RocksDB is enabled. rocksdb_filtering_prefix_length
12
Specifies the prefix length for each key. Optional values are 12
and 16
. The configuration is measured in bytes. No enable_partitioned_index_filter
false
When set to true
, it reduces the amount of memory used by the bloom filter. But in some random-seek situations, it may reduce the read performance. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. No"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#rocksdb_options","title":"RocksDB options","text":"Name Predefined value Description Whether supports runtime dynamic modifications rocksdb_db_options
{}
Specifies the RocksDB database options. No rocksdb_column_family_options
{\"write_buffer_size\":\"67108864\",
\"max_write_buffer_number\":\"4\",
\"max_bytes_for_level_base\":\"268435456\"}
Specifies the RocksDB column family options. No rocksdb_block_based_table_options
{\"block_size\":\"8192\"}
Specifies the RocksDB block based table options. No The format of the RocksDB option is {\"<option_name>\":\"<option_value>\"}
. Multiple options are separated with commas.
Supported options of rocksdb_db_options
and rocksdb_column_family_options
are listed as follows.
rocksdb_db_options
max_total_wal_size\ndelete_obsolete_files_period_micros\nmax_background_jobs\nstats_dump_period_sec\ncompaction_readahead_size\nwritable_file_max_buffer_size\nbytes_per_sync\nwal_bytes_per_sync\ndelayed_write_rate\navoid_flush_during_shutdown\nmax_open_files\nstats_persist_period_sec\nstats_history_buffer_size\nstrict_bytes_per_sync\nenable_rocksdb_prefix_filtering\nenable_rocksdb_whole_key_filtering\nrocksdb_filtering_prefix_length\nnum_compaction_threads\nrate_limit\n
rocksdb_column_family_options
write_buffer_size\nmax_write_buffer_number\nlevel0_file_num_compaction_trigger\nlevel0_slowdown_writes_trigger\nlevel0_stop_writes_trigger\ntarget_file_size_base\ntarget_file_size_multiplier\nmax_bytes_for_level_base\nmax_bytes_for_level_multiplier\ndisable_auto_compactions \n
For more information, see RocksDB official documentation.
"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#misc_configurations","title":"Misc configurations","text":"Caution
The configuration snapshot
in the following table is different from the snapshot in NebulaGraph. The snapshot
here refers to the stock data on the leader when synchronizing Raft.
query_concurrently
true
Whether to turn on multi-threaded queries. Enabling it can improve the latency performance of individual queries, but it will reduce the overall throughput under high pressure. Yes auto_remove_invalid_space
true
After executing DROP SPACE
, the specified graph space will be deleted. This parameter sets whether to delete all the data in the specified graph space at the same time. When the value is true
, all the data in the specified graph space will be deleted at the same time. Yes num_io_threads
16
The number of network I/O threads used to send RPC requests and receive responses. No num_max_connections
0
Max active connections for all networking threads. 0 means no limit.Max connections for each networking thread = num_max_connections / num_netio_threads No num_worker_threads
32
The number of worker threads for one RPC-based Storage service. No max_concurrent_subtasks
10
The maximum number of concurrent subtasks to be executed by the task manager. No snapshot_part_rate_limit
10485760
The rate limit when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes/s. Yes snapshot_batch_size
1048576
The amount of data sent in each batch when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes. Yes rebuild_index_part_rate_limit
4194304
The rate limit when the Raft leader synchronizes the index data rate with other members of the Raft group during the index rebuilding process. Unit: bytes/s. Yes rebuild_index_batch_size
1048576
The amount of data sent in each batch when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#memory_tracker_configurations","title":"Memory Tracker configurations","text":"Note
Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Storage service.
/sys/fs/cgroup/storaged/
, and then add and configure the memory.max
file under the directory.Add the following configurations to etc/nebula-storaged.conf
.
--containerized=true\n--cgroup_v2_controllers=/sys/fs/cgroup/graphd/cgroup.controllers\n--cgroup_v2_memory_stat_path=/sys/fs/cgroup/graphd/memory.stat\n--cgroup_v2_memory_max_path=/sys/fs/cgroup/graphd/memory.max\n--cgroup_v2_memory_current_path=/sys/fs/cgroup/graphd/memory.current\n
For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.
Name Predefined value Description Whether supports runtime dynamic modificationsmemory_tracker_limit_ratio
0.8
The value of this parameter can be set to (0, 1]
, 2
, and 3
.(0, 1]
: The percentage of available memory. Formula: Percentage of available memory = Available memory / (Total memory - Reserved memory)
.When an ongoing query results in memory usage exceeding the configured limit, the query fails and subsequently the memory is released. Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, the value of memory_tracker_limit_ratio
should be set to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than 0.5
.2
: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory. Note: This feature is experimental. As memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur to handle large memory allocations. 3
: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded. Yes memory_tracker_untracked_reserved_memory_mb
50
The reserved memory that is not tracked by the Memory Tracker. Unit: MB. Yes memory_tracker_detail_log
false
Whether to enable the Memory Tracker log. When the value is true
, the Memory Tracker log is generated. Yes memory_tracker_detail_log_interval_ms
60000
The time interval for generating the Memory Tracker log. Unit: Millisecond. memory_tracker_detail_log
is true
when this parameter takes effect. Yes memory_purge_enabled
true
Whether to enable the memory purge feature. When the value is true
, the memory purge feature is enabled. Yes memory_purge_interval_seconds
10
The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if memory_purge_enabled
is set to true. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#for_super-large_vertices","title":"For super-Large vertices","text":"When the query starting from each vertex gets an edge, truncate it directly to avoid too many neighboring edges on the super-large vertex, because a single query occupies too much hard disk and memory. Or you can truncate a certain number of edges specified in the Max_edge_returned_per_vertex
parameter. Excess edges will not be returned. This parameter applies to all spaces.
2147483647
Specifies the maximum number of edges returned for each dense vertex. Excess edges are truncated and not returned. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. No"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#storage_configurations_for_large_dataset","title":"Storage configurations for large dataset","text":"Warning
One graph space takes up at least about 300 MB of memory.
When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter
parameter to true
. The performance is affected because RocksDB indexes are cached.
This topic introduces the Kernel configurations in Nebula\u00a0Graph.
"},{"location":"5.configurations-and-logs/1.configurations/6.kernel-config/#resource_control","title":"Resource control","text":"You may run the ulimit
command to control the resource threshold. However, the changes made only take effect for the current session or sub-process. To make permanent changes, edit file /etc/security/limits.conf
. The configuration is as follows:
# <domain> <type> <item> <value>\n* soft core unlimited \n* hard core unlimited \n* soft nofile 130000 \n* hard nofile 130000\n
Note
The configuration modification takes effect for new sessions.
The parameter descriptions are as follows.
Parameter Descriptiondomain
Control Domain. This parameter can be a user name, a user group name (starting with @
), or *
to indicate all users. type
Control type. This parameter can be soft
or hard
. soft
indicates a soft threshold (the default threshold) for the resource and hard
indicates a maximum value that can be set by the user. The ulimit
command can be used to increase soft
, but not to exceed hard
. item
Resource types. For example, core
limits the size of the core dump file, and nofile
limits the maximum number of file descriptors a process can open. value
Resource limit value. This parameter can be a number, or unlimited
to indicate that there is no limit. You can run man limits.conf
for more helpful information.
vm.swappiness
specifies the percentage of the available memory before starting swap. The greater the value, the more likely the swap occurs. We recommend that you set it to 0. When set to 0, the page cache is removed first. Note that when vm.swappiness
is 0, it does not mean that there is no swap.
vm.min_free_kbytes
specifies the minimum number of kilobytes available kept by Linux VM. If you have a large system memory, we recommend that you increase this value. For example, if your physical memory 128GB, set it to 5GB. If the value is not big enough, the system cannot apply for enough continuous physical memory.
vm.max_map_count
limits the maximum number of vma (virtual memory area) for a process. The default value is 65530
. It is enough for most applications. If your memory application fails because the memory consumption is large, increase the vm.max_map_count
value.
These values control the dirty data cache for the system. For write-intensive scenarios, you can make adjustments based on your needs (throughput priority or delay priority). We recommend that you use the system default value.
"},{"location":"5.configurations-and-logs/1.configurations/6.kernel-config/#transparent_huge_pages","title":"Transparent Huge Pages","text":"Transparent Huge Pages (THP) is a memory management feature of the Linux kernel, which enhances the system's ability to use large pages. In most database systems, Transparent Huge Pages can degrade performance, so it is recommended to disable it.
Perform the following steps:
Edit the GRUB configuration file /etc/default/grub
.
sudo vi /etc/default/grub\n
Add transparent_hugepage=never
to the GRUB_CMDLINE_LINUX
option, and then save and exit.
GRUB_CMDLINE_LINUX=\"... transparent_hugepage=never\"\n
Update the GRUB configuration.
For CentOS:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg\n
For Ubuntu:
sudo update-grub\n
Reboot the computer.
sudo reboot\n
If you don't want to reboot, you can run the following commands to temporarily disable THP until the next reboot.
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled\necho 'never' > /sys/kernel/mm/transparent_hugepage/defrag\n
The default value of net.ipv4.tcp_slow_start_after_idle
is 1
. If set, the congestion window is timed out after an idle period. We recommend that you set it to 0
, especially for long fat scenarios (high latency and large bandwidth).
net.core.somaxconn
specifies the maximum number of connection queues listened by the socket. The default value is 128
. For scenarios with a large number of burst connections, we recommend that you set it to greater than 1024
.
net.ipv4.tcp_max_syn_backlog
specifies the maximum number of TCP connections in the SYN_RECV (semi-connected) state. The setting rule for this parameter is the same as that of net.core.somaxconn
.
net.core.netdev_max_backlog
specifies the maximum number of packets. The default value is 1000
. We recommend that you increase it to greater than 10,000
, especially for 10G network adapters.
These values keep parameters alive for TCP connections. For applications that use a 4-layer transparent load balancer, if the idle connection is disconnected unexpectedly, decrease the values of tcp_keepalive_time
and tcp_keepalive_intvl
.
net.ipv4.tcp_wmem/rmem
specifies the minimum, default, and maximum size of the buffer pool sent/received by the TCP socket. For long fat links, we recommend that you increase the default value to bandwidth (GB) * RTT (ms)
.
For SSD devices, we recommend that you set scheduler
to noop
or none
. The path is /sys/block/DEV_NAME/queue/scheduler
.
we recommend that you set it to core
and set kernel.core_uses_pid
to 1
.
sysctl <conf_name>
Checks the current parameter value.
sysctl -w <conf_name>=<value>
Modifies the parameter value. The modification takes effect immediately. The original value is restored after restarting.
sysctl -p [<file_path>]
Loads Linux parameter values \u200b\u200bfrom the specified configuration file. The default path is /etc/sysctl.conf
.
The prlimit
command gets and sets process resource limits. You can modify the hard threshold by using it and the sudo
command. For example, prlimit --nofile = 130000 --pid = $$
adjusts the maximum number of open files permitted by the current process to 14000
. And the modification takes effect immediately. Note that this command is only available in RedHat 7u or higher versions.
Runtime logs are provided for DBAs and developers to locate faults when the system fails.
NebulaGraph uses glog to print runtime logs, uses gflags to control the severity level of the log, and provides an HTTP interface to dynamically change the log level at runtime to facilitate tracking.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#log_directory","title":"Log directory","text":"The default runtime log directory is /usr/local/nebula/logs/
.
If the log directory is deleted while NebulaGraph is running, the log would not continue to be printed. However, this operation will not affect the services. To recover the logs, restart the services.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#parameter_descriptions","title":"Parameter descriptions","text":"minloglevel
: Specifies the minimum level of the log. That is, no logs below this level will be printed. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs.v
: Specifies the detailed level of the log. The larger the value, the more detailed the log is. Optional values are 0
, 1
, 2
, 3
.The default severity level for the metad, graphd, and storaged logs can be found in their respective configuration files. The default path is /usr/local/nebula/etc/
.
Check all the flag values (log values included) of the current gflags with the following command.
$ curl <ws_ip>:<ws_port>/flags\n
Parameter Description ws_ip
The IP address for the HTTP service, which can be found in the configuration files above. The default value is 127.0.0.1
. ws_port
The port for the HTTP service, which can be found in the configuration files above. The default values are 19559
(Meta), 19669
(Graph), and 19779
(Storage) respectively. Examples are as follows:
minloglevel
in the Meta service:$ curl 127.0.0.1:19559/flags | grep 'minloglevel'\n
v
in the Storage service:$ curl 127.0.0.1:19779/flags | grep -w 'v'\n
Change the severity level of the log with the following command.
$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"<key>\":<value>[,\"<key>\":<value>]}' \"<ws_ip>:<ws_port>/flags\"\n
Parameter Description key
The type of the log to be changed. For optional values, see Parameter descriptions. value
The level of the log. For optional values, see Parameter descriptions. ws_ip
The IP address for the HTTP service, which can be found in the configuration files above. The default value is 127.0.0.1
. ws_port
The port for the HTTP service, which can be found in the configuration files above. The default values are 19559
(Meta), 19669
(Graph), and 19779
(Storage) respectively. Examples are as follows:
$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"minloglevel\":0,\"v\":3}' \"127.0.0.1:19779/flags\" # storaged\n$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"minloglevel\":0,\"v\":3}' \"127.0.0.1:19669/flags\" # graphd\n$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"minloglevel\":0,\"v\":3}' \"127.0.0.1:19559/flags\" # metad\n
If the log level is changed while NebulaGraph is running, it will be restored to the level set in the configuration file after restarting the service. To permanently modify it, see Configuration files.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#rocksdb_runtime_logs","title":"RocksDB runtime logs","text":"RocksDB runtime logs are usually used to debug RocksDB parameters and stored in /usr/local/nebula/data/storage/nebula/$id/data/LOG
. $id
is the ID of the example.
Glog does not inherently support log recycling. To implement this feature, you can either use cron jobs in Linux to regularly remove old log files or use the log management tool, logrotate, to rotate logs for regular archiving and deletion.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#log_recycling_using_cron_jobs","title":"Log recycling using cron jobs","text":"This section provides an example of how to use cron jobs to regularly delete old log files from the Graph service's runtime logs.
In the Graph service configuration file, apply the following settings and restart the service:
timestamp_in_logfile_name = true\nmax_log_size = 500\n
timestamp_in_logfile_name
to true
, the log file name includes a timestamp, allowing regular deletion of old log files.max_log_size
parameter sets the maximum size of a single log file in MB, such as 500
. Once this size is exceeded, a new log file is automatically created. The default value is 1800
.Use the following command to open the cron job editor.
crontab -e\n
Add a cron job command to the editor to regularly delete old log files.
* * * * * find <log_path> -name \"<YourProjectName>\" -mtime +7 -delete\n
Caution
The find
command in the above command should be executed by the root user or a user with sudo privileges.
* * * * *
: This cron job time field signifies that the task is executed every minute. For other settings, see Cron Expression.<log_path>
: The path of the service runtime log file, such as /usr/local/nebula/logs
.<YourProjectName>
: The log file name, such as nebula-graphd.*
.-mtime +7
: This deletes log files that are older than 7 days. Alternatively, use -mmin +n
to delete log files older than n
minutes. For details, see the find command.-delete
: This deletes log files that meet the conditions.For example, to automatically delete the Graph service runtime log files older than 7 days at 3 o'clock every morning, use:
0 3 * * * find /usr/local/nebula/logs -name nebula-graphd.* -mtime +7 -delete\n
Save the cron job and exit the editor.
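To confirm that the job was saved, you can list the current user's cron jobs:
crontab -l\n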
Logrotate is a tool that can rotate specified log files for archiving and recycling.
Note
You must be the root user or a user with sudo privileges to install or run logrotate.
This section provides an example of how to use logrotate to manage the Graph service's INFO
level log file (/usr/local/nebula/logs/nebula-graphd.INFO.impl
).
In the Graph service configuration file, set timestamp_in_logfile_name
to false
so that the logrotate tool can recognize the log file name. Then, restart the service.
timestamp_in_logfile_name = false\n
Install logrotate.
For Debian/Ubuntu:
sudo apt-get install logrotate\n
For CentOS/RHEL:
sudo yum install logrotate\n
Create a logrotate configuration file, add log rotation rules, and save the configuration file.
In the /etc/logrotate.d
directory, create a new logrotate configuration file nebula-graphd.INFO
.
sudo vim /etc/logrotate.d/nebula-graphd.INFO\n
Then, add the following content:
# The absolute path of the log file needs to be configured\n# And the file name cannot be a symbolic link file, such as `nebula-graphd.INFO`\n/usr/local/nebula/logs/nebula-graphd.INFO.impl {\n daily\n rotate 2\n copytruncate\n nocompress\n missingok\n notifempty\n create 644 root root\n dateext\n dateformat .%Y-%m-%d-%s\n maxsize 1k\n}\n
Parameter Description daily
Rotate the log daily. Other available time units include hourly
, daily
, weekly
, monthly
, and yearly
. rotate 2
Keep the most recent 2 log files before deleting the older one. copytruncate
Copy the current log file and then truncate it, ensuring no disruption to the logging process. nocompress
Do not compress the old log files. missingok
Do not report errors if the log file is missing. notifempty
Do not rotate the log file if it's empty. create 644 root root
Create a new log file with the specified permissions and ownership. dateext
Add a date extension to the log file name. The default is the current date in the format -%Y%m%d
. You can extend this using the dateformat
option. dateformat .%Y-%m-%d-%s
This must follow immediately after dateext
and defines the file name after log rotation. Before V3.9.0, only %Y
, %m
, %d
, and %s
parameters were supported. Starting from V3.9.0, the %H
parameter is also supported. maxsize 1k
Rotate the log when it exceeds 1 kilobyte (1024
bytes) in size or when the specified time unit (e.g., daily
) passes. You can use size units like k
and M
, with the default unit being bytes. Modify the parameters in the configuration file according to actual needs. For more information about parameter configuration, see logrotate.
Test the logrotate configuration.
To verify whether the logrotate configuration is correct, use the following command for testing.
sudo logrotate --debug /etc/logrotate.d/nebula-graphd.INFO\n
Execute logrotate.
Although logrotate
is typically executed automatically by cron jobs, you can manually execute the following command to perform log rotation immediately.
sudo logrotate -fv /etc/logrotate.d/nebula-graphd.INFO\n
-fv
: f
stands for forced execution, v
stands for verbose output.
Verify the log rotation results.
After log rotation, new log files are found in the /usr/local/nebula/logs
directory, such as nebula-graphd.INFO.impl.2024-01-04-1704338204
. The original log content is cleared, but the file is retained for new log entries. When the number of log files exceeds the value set by rotate
, the oldest log file is deleted.
For example, rotate 2 means keeping the 2 most recently generated log files. When the number of log files exceeds 2, the oldest log file is deleted.
[test@test logs]$ ll\n-rw-r--r-- 1 root root 0 Jan 4 11:18 nebula-graphd.INFO.impl \n-rw-r--r-- 1 root root 6894 Jan 4 11:16 nebula-graphd.INFO.impl.2024-01-04-1704338204 # This file is deleted when a new log file is generated\n-rw-r--r-- 1 root root 222 Jan 4 11:18 nebula-graphd.INFO.impl.2024-01-04-1704338287\n[test@test logs]$ ll\n-rw-r--r-- 1 root root 0 Jan 4 11:18 nebula-graphd.INFO.impl\n-rw-r--r-- 1 root root 222 Jan 4 11:18 nebula-graphd.INFO.impl.2024-01-04-1704338287\n-rw-r--r-- 1 root root 222 Jan 4 11:18 nebula-graphd.INFO.impl.2024-01-04-1704338339 # The new log file is generated\n
If you need to rotate multiple log files, create multiple configuration files in the /etc/logrotate.d
directory, with each configuration file corresponding to a log file. For example, to rotate the INFO
level log file and the WARNING
level log file of the Meta service, create two configuration files nebula-metad.INFO
and nebula-metad.WARNING
, and add log rotation rules in them respectively.
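For instance, a minimal sketch of /etc/logrotate.d/nebula-metad.INFO, assuming the Meta service also logs under /usr/local/nebula/logs and follows the same .impl naming convention as the Graph service:
/usr/local/nebula/logs/nebula-metad.INFO.impl {\n daily\n rotate 2\n copytruncate\n nocompress\n missingok\n notifempty\n create 644 root root\n}\n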
NebulaGraph supports querying the monitoring metrics through HTTP ports.
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#metrics_structure","title":"Metrics structure","text":"Each metric of NebulaGraph consists of three fields: name, type, and time range. The fields are separated by periods, for example, num_queries.sum.600
. Different NebulaGraph services (Graph, Storage, or Meta) support different metrics. The detailed description is as follows.
num_queries
Indicates the function of the metric. Metric type sum
Indicates how the metrics are collected. Supported types are SUM, AVG, RATE, and the P-th sample quantiles such as P75, P95, P99, and P999. Time range 600
The time range in seconds for the metric collection. Supported values are 5, 60, 600, and 3600, representing the last 5 seconds, 1 minute, 10 minutes, and 1 hour."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#query_metrics_over_http","title":"Query metrics over HTTP","text":""},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#syntax","title":"Syntax","text":"curl -G \"http://<host>:<port>/stats?stats=<metric_name_list> [&format=json]\"\n
Parameter Description host
The IP (or hostname) of the server. You can find it in the configuration file in the installation directory. port
The HTTP port of the server. You can find it in the configuration file in the installation directory. The default ports are 19559 (Meta), 19669 (Graph), and 19779 (Storage). metric_name_list
The metrics names. Multiple metrics are separated by commas (,). &format=json
Optional. Returns the result in the JSON format. Note
If NebulaGraph is deployed with Docker Compose, run docker-compose ps
to check the ports that are mapped from the service ports inside the container, and then query through them.
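For example (the service name shown here depends on your Compose file):
$ docker-compose ps | grep graphd\n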
Query the number of queries in the last 10 minutes in the Graph Service.
$ curl -G \"http://192.168.8.40:19669/stats?stats=num_queries.sum.600\"\nnum_queries.sum.600=400\n
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#query_multiple_metrics","title":"Query multiple metrics","text":"Query the following metrics together:
The average heartbeat latency in the last 1 minute, and the average latency of the slowest 1% heartbeats, i.e., the P99 heartbeats, in the last 10 minutes.
$ curl -G \"http://192.168.8.40:19559/stats?stats=heartbeat_latency_us.avg.60,heartbeat_latency_us.p99.600\"\nheartbeat_latency_us.avg.60=281\nheartbeat_latency_us.p99.600=985\n
Query the number of new vertices in the Storage Service in the last 10 minutes and return the result in the JSON format.
$ curl -G \"http://192.168.8.40:19779/stats?stats=num_add_vertices.sum.600&format=json\"\n[{\"value\":1,\"name\":\"num_add_vertices.sum.600\"}]\n
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#query_all_metrics_in_a_service","title":"Query all metrics in a service.","text":"If no metric is specified in the query, NebulaGraph returns all metrics in the service.
$ curl -G \"http://192.168.8.40:19559/stats\"\nheartbeat_latency_us.avg.5=304\nheartbeat_latency_us.avg.60=308\nheartbeat_latency_us.avg.600=299\nheartbeat_latency_us.avg.3600=285\nheartbeat_latency_us.p75.5=652\nheartbeat_latency_us.p75.60=669\nheartbeat_latency_us.p75.600=651\nheartbeat_latency_us.p75.3600=642\nheartbeat_latency_us.p95.5=930\nheartbeat_latency_us.p95.60=963\nheartbeat_latency_us.p95.600=933\nheartbeat_latency_us.p95.3600=929\nheartbeat_latency_us.p99.5=986\nheartbeat_latency_us.p99.60=1409\nheartbeat_latency_us.p99.600=989\nheartbeat_latency_us.p99.3600=986\nnum_heartbeats.rate.5=0\nnum_heartbeats.rate.60=0\nnum_heartbeats.rate.600=0\nnum_heartbeats.rate.3600=0\nnum_heartbeats.sum.5=2\nnum_heartbeats.sum.60=40\nnum_heartbeats.sum.600=394\nnum_heartbeats.sum.3600=2364\n...\n
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#space-level_metrics","title":"Space-level metrics","text":"The Graph service supports a set of space-level metrics that record the information of different graph spaces separately.
Space-level metrics can be queried only by querying all metrics. For example, run curl -G \"http://192.168.8.40:19669/stats\"
to show all metrics. The returned result contains the graph space name in the form of '{space=space_name}', such as num_active_queries{space=basketballplayer}.sum.5=0
.
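To pick out the metrics of a single graph space from that output, you can filter it; the space name basketballplayer below is illustrative:
$ curl -G \"http://192.168.8.40:19669/stats\" | grep 'space=basketballplayer'\n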
Caution
To enable space-level metrics, set the value of enable_space_level_metrics
to true
in the Graph service configuration file before starting NebulaGraph. For details about how to modify the configuration, see Configuration Management.
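A sketch of the corresponding line in nebula-graphd.conf:
--enable_space_level_metrics=true\n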
num_active_queries
The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. num_active_sessions
The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time.For example, when querying num_active_sessions.sum.5
, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is -20
(10-30). num_aggregate_executors
The number of executions for the Aggregation operator. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions_out_of_max_allowed
The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS
was exceeded. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_indexscan_executors
The number of executions for index scan operators. num_killed_queries
The number of killed queries. num_opened_sessions
The number of sessions connected to the server. num_queries
The number of queries. num_query_errors_leader_changes
The number of the raft leader changes due to query errors. num_query_errors
The number of query errors. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. num_sentences
The number of statements received by the Graphd service. num_slow_queries
The number of slow queries. num_sort_executors
The number of executions for the Sort operator. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. slow_query_latency_us
The latency of slow queries. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. resp_part_completeness
The completeness of the partial success. You need to set accept_partial_success
to true
in the graph configuration first."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#meta","title":"Meta","text":"Parameter Description commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. heartbeat_latency_us
The latency of heartbeats. num_heartbeats
The number of heartbeats. num_raft_votes
The number of votes in Raft. transfer_leader_latency_us
The latency of transferring the raft leader. num_agent_heartbeats
The number of heartbeats for the AgentHBProcessor. agent_heartbeat_latency_us
The latency of the AgentHBProcessor. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_send_snapshot
The number of times that Raft sends snapshots to other nodes. append_log_latency_us
The latency of replicating the log record to a single node by Raft. append_wal_latency_us
The Raft write latency for a single WAL. num_grant_votes
The number of times that Raft votes for other nodes. num_start_elect
The number of times that Raft starts an election."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#storage","title":"Storage","text":"Parameter Description add_edges_latency_us
The latency of adding edges. add_vertices_latency_us
The latency of adding vertices. commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. delete_edges_latency_us
The latency of deleting edges. delete_vertices_latency_us
The latency of deleting vertices. get_neighbors_latency_us
The latency of querying neighbor vertices. get_dst_by_src_latency_us
The latency of querying the destination vertex by the source vertex. num_get_prop
The number of executions for the GetPropProcessor. num_get_neighbors_errors
The number of execution errors for the GetNeighborsProcessor. num_get_dst_by_src_errors
The number of execution errors for the GetDstBySrcProcessor. get_prop_latency_us
The latency of executions for the GetPropProcessor. num_edges_deleted
The number of deleted edges. num_edges_inserted
The number of inserted edges. num_raft_votes
The number of votes in Raft. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Storage service sent to the Meta service. num_rpc_sent_to_metad
The number of RPC requests that the Storaged service sent to the Metad service. num_tags_deleted
The number of deleted tags. num_vertices_deleted
The number of deleted vertices. num_vertices_inserted
The number of inserted vertices. transfer_leader_latency_us
The latency of transferring the raft leader. lookup_latency_us
The latency of executions for the LookupProcessor. num_lookup_errors
The number of execution errors for the LookupProcessor. num_scan_vertex
The number of executions for the ScanVertexProcessor. num_scan_vertex_errors
The number of execution errors for the ScanVertexProcessor. update_edge_latency_us
The latency of executions for the UpdateEdgeProcessor. num_update_vertex
The number of executions for the UpdateVertexProcessor. num_update_vertex_errors
The number of execution errors for the UpdateVertexProcessor. kv_get_latency_us
The latency of executions for the Getprocessor. kv_put_latency_us
The latency of executions for the PutProcessor. kv_remove_latency_us
The latency of executions for the RemoveProcessor. num_kv_get_errors
The number of execution errors for the GetProcessor. num_kv_get
The number of executions for the GetProcessor. num_kv_put_errors
The number of execution errors for the PutProcessor. num_kv_put
The number of executions for the PutProcessor. num_kv_remove_errors
The number of execution errors for the RemoveProcessor. num_kv_remove
The number of executions for the RemoveProcessor. forward_tranx_latency_us
The latency of transmission. scan_edge_latency_us
The latency of executions for the ScanEdgeProcessor. num_scan_edge_errors
The number of execution errors for the ScanEdgeProcessor. num_scan_edge
The number of executions for the ScanEdgeProcessor. scan_vertex_latency_us
The latency of executions for the ScanVertexProcessor. num_add_edges
The number of times that edges are added. num_add_edges_errors
The number of errors when adding edges. num_add_vertices
The number of times that vertices are added. num_start_elect
The number of times that Raft starts an election. num_add_vertices_errors
The number of errors when adding vertices. num_delete_vertices_errors
The number of errors when deleting vertices. append_log_latency_us
The latency of replicating the log record to a single node by Raft. num_grant_votes
The number of times that Raft votes for other nodes. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_delete_tags
The number of times that tags are deleted. num_delete_tags_errors
The number of errors when deleting tags. num_delete_edges
The number of edge deletions. num_delete_edges_errors
The number of errors when deleting edges num_send_snapshot
The number of times that snapshots are sent. update_vertex_latency_us
The latency of executions for the UpdateVertexProcessor. append_wal_latency_us
The Raft write latency for a single WAL. num_update_edge
The number of executions for the UpdateEdgeProcessor. delete_tags_latency_us
The latency of deleting tags. num_update_edge_errors
The number of execution errors for the UpdateEdgeProcessor. num_get_neighbors
The number of executions for the GetNeighborsProcessor. num_get_dst_by_src
The number of executions for the GetDstBySrcProcessor. num_get_prop_errors
The number of execution errors for the GetPropProcessor. num_delete_vertices
The number of times that vertices are deleted. num_lookup
The number of executions for the LookupProcessor. num_sync_data
The number of times the Storage service synchronizes data from the Drainer. num_sync_data_errors
The number of errors that occur when the Storage service synchronizes data from the Drainer. sync_data_latency_us
The latency of the Storage service synchronizing data from the Drainer."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#graph_space","title":"Graph space","text":"Note
Space-level metrics are created dynamically, so that only when the behavior is triggered in the graph space, the corresponding metric is created and can be queried by the user.
Parameter Descriptionnum_active_queries
The number of queries currently being executed. num_queries
The number of queries. num_sentences
The number of statements received by the Graphd service. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. num_slow_queries
The number of slow queries. num_query_errors
The number of query errors. num_query_errors_leader_changes
The number of raft leader changes due to query errors. num_killed_queries
The number of killed queries. num_aggregate_executors
The number of executions for the Aggregation operator. num_sort_executors
The number of executions for the Sort operator. num_indexscan_executors
The number of executions for index scan operators. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_opened_sessions
The number of sessions connected to the server. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. slow_query_latency_us
The latency of slow queries."},{"location":"6.monitor-and-metrics/2.rocksdb-statistics/","title":"RocksDB statistics","text":"NebulaGraph uses RocksDB as the underlying storage. This topic describes how to collect and show the RocksDB statistics of NebulaGraph.
"},{"location":"6.monitor-and-metrics/2.rocksdb-statistics/#enable_rocksdb","title":"Enable RocksDB","text":"By default, the function of RocksDB statistics is disabled. To enable RocksDB statistics, you need to:
Modify the --enable_rocksdb_statistics
parameter as true
in the nebula-storaged.conf
file. The default path of the configuration file is /use/local/nebula/etc
.
Restart the service to make the modification valid.
Users can use the built-in HTTP service in the storage service to get the following types of statistics. Results in the JSON format are supported.
Use the following command to get all RocksDB statistics:
curl -L \"http://${storage_ip}:${port}/rocksdb_stats\"\n
For example:
curl -L \"http://172.28.2.1:19779/rocksdb_stats\"\n\nrocksdb.blobdb.blob.file.bytes.read=0\nrocksdb.blobdb.blob.file.bytes.written=0\nrocksdb.blobdb.blob.file.bytes.synced=0\n...\n
Use the following command to get specified RocksDB statistics:
curl -L \"http://${storage_ip}:${port}/rocksdb_stats?stats=${stats_name}\"\n
For example, use the following command to get the information of rocksdb.bytes.read
and rocksdb.block.cache.add
.
curl -L \"http://172.28.2.1:19779/rocksdb_stats?stats=rocksdb.bytes.read,rocksdb.block.cache.add\"\n\nrocksdb.block.cache.add=14\nrocksdb.bytes.read=1632\n
Use the following command to get specified RocksDB statistics in the JSON format:
curl -L \"http://${storage_ip}:${port}/rocksdb_stats?stats=${stats_name}&format=json\"\n
For example, use the following command to get the information of rocksdb.bytes.read
and rocksdb.block.cache.add
and return the results in the JSON format.
curl -L \"http://172.28.2.1:19779/rocksdb_stats?stats=rocksdb.bytes.read,rocksdb.block.cache.add&format=json\"\n\n[\n {\n \"rocksdb.block.cache.add\": 1\n },\n {\n \"rocksdb.bytes.read\": 160\n }\n]\n
"},{"location":"7.data-security/4.ssl/","title":"SSL encryption","text":"NebulaGraph supports SSL encrypted transfers between the Client, Graph Service, Meta Service, and Storage Service, and this topic describes how to set up SSL encryption.
"},{"location":"7.data-security/4.ssl/#precaution","title":"Precaution","text":"Enabling SSL encryption will slightly affect the performance, such as causing operation latency.
"},{"location":"7.data-security/4.ssl/#certificate_modes","title":"Certificate modes","text":"To use SSL encryption, SSL certificates are required. NebulaGraph supports two certificate modes.
Self-signed certificate mode
A certificate that is generated by the server itself and signed by itself. In the self-signed certificate mode, the server needs to generate its own SSL certificate and key, and then use its own private key to sign the certificate. It is suitable for building secure communications for systems and applications within a LAN.
CA-signed certificate mode
A certificate granted by a trusted third-party Certificate Authority (CA). In the CA signed certificate mode, the server needs to apply for an SSL certificate from a trusted CA and ensure the authenticity and trustworthiness of the certificate through the auditing and signing of the certificate authority center. It is suitable for public network environment, especially for websites, e-commerce and other occasions that need to protect user information security.
Policies for the NebulaGraph community edition.
Scene TLS External device access to Graph Modify the Graph configuration file to add the following parameters:--enable_graph_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
Graph access Meta In the Graph/Meta configuration file, add the following parameters:--enable_meta_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
Graph access StorageMeta access Storage In the Graph/Meta/Storage configuration file, add the following parameters:--enable_storage_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
Graph access Meta/StorageMeta access Storage In the Graph/Meta/Storage configuration file, add the following parameters:--enable_meta_ssl = true
--enable_storage_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
External device access to GraphGraph access Meta/StorageMeta access Storage In the Graph/Meta/Storage configuration file, add the following parameters:--enable_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
The parameters are described below.
Parameter Default value Descriptioncert_path
- The path to the SSL public key certificate. This certificate is usually a .pem
or .crt
file, which is used to prove the identity of the server side, and contains information such as the public key, certificate owner, digital signature, and so on. key_path
- The path to the SSL key. The SSL key is usually a .key
file. password_path
- (Optional) The path to the password file for the SSL key. Some SSL keys are encrypted and require a corresponding password to decrypt. We need to store the password in a separate file and use this parameter to specify the path to the password file. ca_path
- The path to the SSL root certificate. The root certificate is a special SSL certificate that is considered the highest level in the SSL trust chain and is used to validate and authorize other SSL certificates. enable_ssl
false
Whether to enable SSL encryption in all services. enable_graph_ssl
false
Whether to enable SSL encryption in the Graph service only. enable_meta_ssl
false
Whether to enable SSL encryption in the Meta service only. enable_storage_ssl
false
Whether to enable SSL encryption in the Storage service only."},{"location":"7.data-security/4.ssl/#example_of_tls","title":"Example of TLS","text":"For example, using self-signed certificates and TLS for data transfers between the client NebulaGraph Python, the Graph service, the Meta service, and the Storage service. You need to set up all three Graph/Meta/Storage configuration files as follows:
--enable_ssl=true\n--ca_path=xxxxxx\n--cert_path=xxxxxx\n--key_path=xxxxxx\n
When the changes are complete, restart these services to make the configuration take effect.
To connect to the Graph service using NebulaGraph Python, you need to set up a secure socket and add a trusted CA. For code examples, see nebula-test-run.py.
NebulaGraph replies on local authentication to implement access control.
NebulaGraph creates a session when a client connects to it. The session stores information about the connection, including the user information. If the authentication system is enabled, the session will be mapped to corresponding users.
Note
By default, the authentication is disabled and NebulaGraph allows connections with the username root
and any password.
Local authentication indicates that usernames and passwords are stored locally on the server, with the passwords encrypted. Users will be authenticated when trying to visit NebulaGraph.
"},{"location":"7.data-security/1.authentication/1.authentication/#enable_local_authentication","title":"Enable local authentication","text":"Modify the nebula-graphd.conf
file (/usr/local/nebula/etc/
is the default path) to set the following parameters:
--enable_authorize
: Set its value to true
to enable authentication.
Note
root
and any password.root
and password nebula
to log into NebulaGraph after enabling local authentication. This account has the built-in God role. For more information about roles, see Roles and privileges.--failed_login_attempts
: This parameter is optional, and you need to add it manually. It specifies the number of consecutive incorrect-password attempts allowed on a single Graph service; when that number is exceeded, your account is locked. For multiple Graph services, the allowed attempts are number of services * failed_login_attempts
.--password_lock_time_in_secs
: This parameter is optional, and you need to add it manually. It specifies how long your account stays locked after multiple incorrect password entries. Unit: second.
User management is an indispensable part of NebulaGraph access control. This topic describes how to manage users and roles.
After enabling authentication, only valid users can connect to NebulaGraph and access the resources according to the user roles.
Note
root
and any password.The root
user with the GOD role can run CREATE USER
to create a new user.
Syntax
CREATE USER [IF NOT EXISTS] <user_name> [WITH PASSWORD '<password>'];\n
IF NOT EXISTS
: Detects if the user name exists. The user will be created only if the user name does not exist.user_name
: Sets the name of the user. The maximum length is 16 characters.password
: Sets the password of the user. The default password is the empty string (''
). The maximum length is 24 characters.Example
nebula> CREATE USER user1 WITH PASSWORD 'nebula';\nnebula> SHOW USERS;\n+---------+-------------------------------+\n| Account | IP Whitelist |\n+---------+-------------------------------+\n| \"root\" | \"\" |\n| \"user1\" | \"\" |\n+---------+-------------------------------+\n
Users with the GOD role or the ADMIN role can run GRANT ROLE
to assign a built-in role in a graph space to a user. For more information about NebulaGraph built-in roles, see Roles and privileges.
Syntax
GRANT ROLE <role_type> ON <space_name> TO <user_name>;\n
Example
nebula> GRANT ROLE USER ON basketballplayer TO user1;\n
Users with the GOD role or the ADMIN role can run REVOKE ROLE
to revoke the built-in role of a user in a graph space. For more information about NebulaGraph built-in roles, see Roles and privileges.
Syntax
REVOKE ROLE <role_type> ON <space_name> FROM <user_name>;\n
Example
nebula> REVOKE ROLE USER ON basketballplayer FROM user1;\n
Users can run DESCRIBE USER
to list the roles for a specified user.
Syntax
DESCRIBE USER <user_name>;\nDESC USER <user_name>;\n
Example
nebula> DESCRIBE USER user1;\n+---------+--------------------+\n| role | space |\n+---------+--------------------+\n| \"ADMIN\" | \"basketballplayer\" |\n+---------+--------------------+\n
Users can run SHOW ROLES
to list the roles in a graph space.
Syntax
SHOW ROLES IN <space_name>;\n
Example
nebula> SHOW ROLES IN basketballplayer;\n+---------+-----------+\n| Account | Role Type |\n+---------+-----------+\n| \"user1\" | \"ADMIN\" |\n+---------+-----------+\n
Users can run CHANGE PASSWORD
to set a new password for a user. The old password is needed when setting a new one.
Syntax
CHANGE PASSWORD <user_name> FROM '<old_password>' TO '<new_password>';\n
Example
nebula> CHANGE PASSWORD user1 FROM 'nebula' TO 'nebula123';\n
The root
user with the GOD role can run ALTER USER
to set a new password. The old password is not needed when altering the user.
Syntax
ALTER USER <user_name> WITH PASSWORD '<password>';\n
- Example nebula> ALTER USER user2 WITH PASSWORD 'nebula';\n
The root
user with the GOD role can run DROP USER
to remove a user.
Note
Removing a user does not close the current session of the user, and the user role still takes effect in the session until the session is closed.
Syntax
DROP USER [IF EXISTS] <user_name>;\n
Example
nebula> DROP USER user1;\n
The root
user with the GOD role can run SHOW USERS
to list all the users.
Syntax
SHOW USERS;\n
Example
nebula> SHOW USERS;\n+---------+-----------------+\n| Account | IP Whitelist |\n+---------+-----------------+\n| \"root\" | \"\" |\n| \"user1\" | \"\" |\n| \"user2\" | \"192.168.10.10\" |\n+---------+-----------------+\n
A role is a collection of privileges. You can assign a role to a user for access control.
"},{"location":"7.data-security/1.authentication/3.role-list/#built-in_roles","title":"Built-in roles","text":"NebulaGraph does not support custom roles, but it has multiple built-in roles:
GOD
root
in Linux and administrator
in Windows.root
is automatically created with the password nebula
.Caution
Modify the password for root
timely for security.
When the --enable_authorize
parameter in the nebula-graphd.conf
file (the default directory is /usr/local/nebula/etc/
) is set to true
:
root
user with the default God role can be used.ADMIN
An ADMIN role of a graph space can grant DBA, USER, and GUEST roles in the graph space to other users.
Note
Only roles lower than ADMIN can be authorized to other users.
DBA
USER
Note
The privileges of roles and the nGQL statements that each role can use are listed as follows.
Privilege God Admin DBA User Guest Allowed nGQL Read space Y Y Y Y YUSE
, DESCRIBE SPACE
Read schema Y Y Y Y Y DESCRIBE TAG
, DESCRIBE EDGE
, DESCRIBE TAG INDEX
, DESCRIBE EDGE INDEX
Write schema Y Y Y Y CREATE TAG
, ALTER TAG
, CREATE EDGE
, ALTER EDGE
, DROP TAG
, DELETE TAG
, DROP EDGE
, CREATE TAG INDEX
, CREATE EDGE INDEX
, DROP TAG INDEX
, DROP EDGE INDEX
Write user Y CREATE USER
, DROP USER
, ALTER USER
Write role Y Y GRANT
, REVOKE
Read data Y Y Y Y Y GO
, SET
, PIPE
, MATCH
, ASSIGNMENT
, LOOKUP
, YIELD
, ORDER BY
, FETCH VERTICES
, Find
, FETCH EDGES
, FIND PATH
, LIMIT
, GROUP BY
, RETURN
Write data Y Y Y Y INSERT VERTEX
, UPDATE VERTEX
, INSERT EDGE
, UPDATE EDGE
, DELETE VERTEX
, DELETE EDGES
, DELETE TAG
Show operations Y Y Y Y Y SHOW
, CHANGE PASSWORD
Job Y Y Y Y SUBMIT JOB COMPACT
, SUBMIT JOB FLUSH
, SUBMIT JOB STATS
, STOP JOB
, RECOVER JOB
, BUILD TAG INDEX
, BUILD EDGE INDEX
,INGEST
, DOWNLOAD
Write space Y CREATE SPACE
, DROP SPACE
, CREATE SNAPSHOT
, DROP SNAPSHOT
, BALANCE
, CONFIG
Caution
SHOW
operations are limited to the role of a user. For example, all users can run SHOW SPACES
, but the results only include the graph spaces that the users have privileges.SHOW USERS
and SHOW SNAPSHOTS
.This topic provides general suggestions for modeling data in NebulaGraph.
Note
The following suggestions may not apply to some special scenarios. In these cases, find help in the NebulaGraph community.
"},{"location":"8.service-tuning/2.graph-modeling/#model_for_performance","title":"Model for performance","text":"There is no perfect method to model in Nebula\u00a0Graph. Graph modeling depends on the questions that you want to know from the data. Your data drives your graph model. Graph data modeling is intuitive and convenient. Create your data model based on your business model. Test your model and gradually optimize it to fit your business. To get better performance, you can change or re-design your model multiple times.
"},{"location":"8.service-tuning/2.graph-modeling/#design_and_evaluate_the_most_important_queries","title":"Design and evaluate the most important queries","text":"Usually, various types of queries are validated in test scenarios to assess the overall capabilities of the system. However, in most production scenarios, there are not many types of frequently used queries. You can optimize the data model based on key queries selected according to the Pareto (80/20) principle.
"},{"location":"8.service-tuning/2.graph-modeling/#full-graph_scanning_avoidance","title":"Full-graph scanning avoidance","text":"Graph traversal can be performed after one or more vertices/edges are located through property indexes or VIDs. But for some query patterns, such as subgraph and path query patterns, the source vertex or edge of the traversal cannot be located through property indexes or VIDs. These queries find all the subgraphs that satisfy the query pattern by scanning the whole graph space which will have poor query performance. NebulaGraph does not implement indexing for the graph structures of subgraphs or paths.
"},{"location":"8.service-tuning/2.graph-modeling/#no_predefined_bonds_between_tags_and_edge_types","title":"No predefined bonds between Tags and Edge types","text":"Define the bonds between Tags and Edge types in the application, not NebulaGraph. There are no statements that could get the bonds between Tags and Edge types.
"},{"location":"8.service-tuning/2.graph-modeling/#tagsedge_types_predefine_a_set_of_properties","title":"Tags/Edge types predefine a set of properties","text":"While creating Tags or Edge types, you need to define a set of properties. Properties are part of the NebulaGraph Schema.
"},{"location":"8.service-tuning/2.graph-modeling/#control_changes_in_the_business_model_and_the_data_model","title":"Control changes in the business model and the data model","text":"Changes here refer to changes in business models and data models (meta-information), not changes in the data itself.
Some graph databases are designed to be Schema-free, so their data modeling, including the modeling of the graph topology and properties, can be very flexible. Properties can be re-modeled to graph topology, and vice versa. Such systems are often specifically optimized for graph topology access.
NebulaGraph master is a strong-Schema (row storage) system, which means that the business model should not change frequently. For example, the property Schema should not change. It is similar to avoiding ALTER TABLE
in MySQL.
On the contrary, vertices and their edges can be added or deleted at low costs. Thus, the easy-to-change part of the business model should be transformed to vertices or edges, rather than properties.
For example, in a business model, people have relatively fixed properties such as age, gender, and name. But their contact, place of visit, trade account, and login device are often changing. The former is suitable for modeling as properties and the latter as vertices or edges.
"},{"location":"8.service-tuning/2.graph-modeling/#set_temporary_properties_through_self-loop_edges","title":"Set temporary properties through self-loop edges","text":"As a strong Schema system, NebulaGraph does not support List-type properties. And using ALTER TAG
costs too much. If you need to add some temporary properties or List-type properties to a vertex, you can first create an edge type with the required properties, and then insert one or more edges that direct to the vertex itself. The figure is as follows.
To retrieve temporary properties of vertices, fetch from self-loop edges. For example:
//Create the edge type and insert the loop property.\nnebula> CREATE EDGE IF NOT EXISTS temp(tmp int);\nnebula> INSERT EDGE temp(tmp) VALUES \"player100\"->\"player100\"@1:(1);\nnebula> INSERT EDGE temp(tmp) VALUES \"player100\"->\"player100\"@2:(2);\nnebula> INSERT EDGE temp(tmp) VALUES \"player100\"->\"player100\"@3:(3);\n\n//After the data is inserted, you can query the loop property by general query statements, for example:\nnebula> GO FROM \"player100\" OVER temp YIELD properties(edge).tmp;\n+----------------------+\n| properties(EDGE).tmp |\n+----------------------+\n| 1 |\n| 2 |\n| 3 |\n+----------------------+\n\n//If you want the results to be returned in the form of a List, you can use a function, for example:\nnebula> MATCH (v1:player)-[e:temp]->() return collect(e.tmp);\n+----------------+\n| collect(e.tmp) |\n+----------------+\n| [1, 2, 3] |\n+----------------+\n
Operations on loops are not encapsulated with any syntactic sugars and you can use them just like those on normal edges."},{"location":"8.service-tuning/2.graph-modeling/#about_dangling_edges","title":"About dangling edges","text":"A dangling edge is an edge that only connects to a single vertex and only one part of the edge connects to the vertex.
In NebulaGraph master, dangling edges may appear in the following two cases.
Insert edges with INSERT EDGE statement before the source vertex or the destination vertex exists.
Delete vertices with DELETE VERTEX statement and the WITH EDGE
option is not used. At this time, the system does not delete the related outgoing and incoming edges of the vertices. There will be dangling edges by default.
Dangling edges may appear in NebulaGraph master as the design allow it to exist. And there is no MERGE statement like openCypher has. The existence of dangling edges depends entirely on the application level. You can use GO and LOOKUP statements to find a dangling edge, but cannot use the MATCH statement to find a dangling edge.
Examples:
// Insert an edge that connects two vertices which do not exist in the graph. The source vertex's ID is '11'. The destination vertex's ID is'13'. \n\nnebula> CREATE EDGE IF NOT EXISTS e1 (name string, age int);\nnebula> INSERT EDGE e1 (name, age) VALUES \"11\"->\"13\":(\"n1\", 1);\n\n// Query using the `GO` statement\n\nnebula> GO FROM \"11\" over e1 YIELD properties(edge);\n+----------------------+\n| properties(EDGE) |\n+----------------------+\n| {age: 1, name: \"n1\"} |\n+----------------------+\n\n// Query using the `LOOKUP` statement\n\nnebula> LOOKUP ON e1 YIELD EDGE AS r;\n+-------------------------------------------------------+\n| r |\n+-------------------------------------------------------+\n| [:e2 \"11\"->\"13\" @0 {age: 1, name: \"n1\"}] |\n+-------------------------------------------------------+\n\n// Query using the `MATCH` statement\n\nnebula> MATCH ()-[e:e1]->() RETURN e;\n+---+\n| e |\n+---+\n+---+\nEmpty set (time spent 3153/3573 us)\n
"},{"location":"8.service-tuning/2.graph-modeling/#breadth-first_traversal_over_depth-first_traversal","title":"Breadth-first traversal over depth-first traversal","text":"person
and add properties name
, age
, and eye_color
to it. If you create a tag eye_color
and an edge type has
, and then create an edge to represent the eye color owned by the person, the traversal performance will not be high.(src)-[edge {P1, P2}]->(dst)
as (src)-[edge1]->(i_node {P1, P2})-[edge2]->(dst)
. With NebulaGraph master, you can use (src)-[edge {P1, P2}]->(dst)
directly to decrease the depth of the traversal and increase the performance.To query in the opposite direction of an edge, use the following syntax:
(dst)<-[edge]-(src)
or GO FROM dst REVERSELY
.
If you do not care about the directions or want to query against both directions, use the following syntax:
(src)-[edge]-(dst)
or GO FROM src BIDIRECT
.
Therefore, there is no need to insert the same edge redundantly in the reversed direction.
"},{"location":"8.service-tuning/2.graph-modeling/#set_tag_properties_appropriately","title":"Set tag properties appropriately","text":"Put a group of properties that are on the same level into the same tag. Different groups represent different concepts.
"},{"location":"8.service-tuning/2.graph-modeling/#use_indexes_correctly","title":"Use indexes correctly","text":"Using property indexes helps find VIDs through properties, but can lead to great performance reduction. Only use an index when you need to find vertices or edges through their properties.
"},{"location":"8.service-tuning/2.graph-modeling/#design_vids_appropriately","title":"Design VIDs appropriately","text":"See VID.
"},{"location":"8.service-tuning/2.graph-modeling/#long_texts","title":"Long texts","text":"Do not use long texts to create edge properties. Edge properties are stored twice and long texts lead to greater write amplification. For how edges properties are stored, see Storage architecture. It is recommended to store long texts in HBase or Elasticsearch and store its address in NebulaGraph.
"},{"location":"8.service-tuning/2.graph-modeling/#dynamic_graphs_sequence_graphs_are_not_supported","title":"Dynamic graphs (sequence graphs) are not supported","text":"In some scenarios, graphs need to have the time information to describe how the structure of the entire graph changes over time.1
The Rank field on Edges in NebulaGraph master can be used to store time in int64, but no field on vertices can do this because if you store the time information as property values, it will be covered by new insertion. Thus NebulaGraph does not support sequence graphs.
"},{"location":"8.service-tuning/2.graph-modeling/#free_graph_data_modeling_tools","title":"Free graph data modeling tools","text":"arrows.app
https://blog.twitter.com/engineering/en_us/topics/insights/2021/temporal-graph-networks\u00a0\u21a9
INSERT
.COMPACTION
and BALANCE
jobs to optimize data format and storage distribution at the right time.Preheat on the application side:
NebulaGraph master applies rule-based execution plans. Users cannot change execution plans, pre-compile queries (and corresponding plan cache), or accelerate queries by specifying indexes.
To view the execution plan and executive summary, see EXPLAIN and PROFILE.
"},{"location":"8.service-tuning/compaction/","title":"Compaction","text":"This topic gives some information about compaction.
In NebulaGraph, Compaction
is the most important background process and has an important effect on performance.
Compaction
reads the data that is written on the hard disk, then re-organizes the data structure and the indexes, and then writes back to the hard disk. The read performance can increase by times after compaction. Thus, to get high read performance, trigger compaction
(full compaction
) manually when writing a large amount of data into Nebula\u00a0Graph.
Note
Note that compaction
leads to long-time hard disk IO. We suggest that users do compaction during off-peak hours (for example, early morning).
NebulaGraph has two types of compaction
: automatic compaction
and full compaction
.
compaction
","text":"Automatic compaction
is automatically triggered when the system reads data, writes data, or the system restarts. The read performance can increase in a short time. Automatic compaction
is enabled by default. But once triggered during peak hours, it can cause unexpected IO occupancy that has an unwanted effect on the performance.
compaction
","text":"Full compaction
enables large-scale background operations for a graph space such as merging files, deleting the data expired by TTL. This operation needs to be initiated manually. Use the following statements to enable full compaction
:
Note
We recommend you to do the full compaction during off-peak hours because full compaction has a lot of IO operations.
nebula> USE <your_graph_space>;\nnebula> SUBMIT JOB COMPACT;\n
The preceding statement returns the job ID. To show the compaction
progress, use the following statement:
nebula> SHOW JOB <job_id>;\n
"},{"location":"8.service-tuning/compaction/#operation_suggestions","title":"Operation suggestions","text":"These are some operation suggestions to keep Nebula\u00a0Graph performing well.
SUBMIT JOB COMPACT
.SUBMIT JOB COMPACT
periodically during off-peak hours (e.g. early morning).To control the write traffic limitation for compactions
, set the following parameter in the nebula-storaged.conf
configuration file.
Note
This parameter limits the rate of all writes including normal writes and compaction writes.
# Limit the write rate to 20MB/s.\n--rocksdb_rate_limit=20 (in MB/s)\n
Compaction
stored?\"","text":"By default, the logs are stored under the LOG
file in the /usr/local/nebula/data/storage/nebula/{1}/data/
directory, or similar to LOG.old.1625797988509303
. You can find the following content.
** Compaction Stats [default] **\nLevel Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop\n----------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n L0 2/0 2.46 KB 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.53 0.51 2 0.264 0 0\n Sum 2/0 2.46 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.53 0.51 2 0.264 0 0\n Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0\n
If the number of L0
files is large, the read performance will be greatly affected and compaction can be triggered.
compactions
for multiple graph spaces at the same time?\"","text":"Yes, you can. But the IO is much larger at this time and the efficiency may be affected.
"},{"location":"8.service-tuning/compaction/#how_much_time_does_it_take_for_full_compactions","title":"\"How much time does it take for fullcompactions
?\"","text":"When rocksdb_rate_limit
is set to 20
, you can estimate the full compaction time by dividing the hard disk usage by the rocksdb_rate_limit
. If you do not set the rocksdb_rate_limit
value, the empirical value is around 50 MB/s.
--rocksdb_rate_limit
dynamically?\"","text":"No, you cannot.
"},{"location":"8.service-tuning/compaction/#can_i_stop_a_full_compaction_after_it_starts","title":"\"Can I stop a fullcompaction
after it starts?\"","text":"No, you cannot. When you start a full compaction, you have to wait till it is done. This is the limitation of RocksDB.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/","title":"Enable AutoFDO for NebulaGraph","text":"The AutoFDO can analyze the performance of an optimized program and use the program's performance information to guide the compiler to re-optimize the program. This document will help you to enable the AutoFDO for NebulaGraph.
More information about the AutoFDO, please refer AutoFDO Wiki.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#resource_preparations","title":"Resource Preparations","text":""},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#install_dependencies","title":"Install Dependencies","text":"Install perf
sudo apt-get update\nsudo apt-get install -y linux-tools-common \\\nlinux-tools-generic \\\nlinux-tools-`uname -r`\n
Install autofdo tool
sudo apt-get update\nsudo apt-get install -y autofdo\n
Or you can compile the autofdo tool from source.
For how to build NebulaGraph from source, please refer to the official document: Install NebulaGraph by compiling the source code. In the configure step, replace CMAKE_BUILD_TYPE=Release
with CMAKE_BUILD_TYPE=RelWithDebInfo
as below:
$ cmake -DCMAKE_INSTALL_PREFIX=/usr/local/nebula -DENABLE_TESTING=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo ..\n
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#prepare_test_data","title":"Prepare Test Data","text":"In our test environment, we use NebulaGraph Bench to prepare the test data and collect the profile data by running the FindShortestPath, Go1Step, Go2Step, Go3Step, InsertPersonScenario 5 scenarios.
Note
You can use your TopN queries in your production environment to collect the profile data, the performance can gain more in your environment.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#prepare_profile_data","title":"Prepare Profile Data","text":""},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#collect_perf_data_for_autofdo_tool","title":"Collect Perf Data For AutoFdo Tool","text":"After the test data preparation work done. Collect the perf data for different scenarios. Get the pid of storaged
, graphd
, metad
.
$ nebula.service status all\n[INFO] nebula-metad: Running as 305422, Listening on 9559\n[INFO] nebula-graphd: Running as 305516, Listening on 9669\n[INFO] nebula-storaged: Running as 305707, Listening on 9779\n
Start the perf record for nebula-graphd and nebula-storaged.
perf record -p 305516,305707 -b -e br_inst_retired.near_taken:pp -o ~/FindShortestPath.data\n
Note
Because the nebula-metad
service contribution percent is small compared with nebula-graphd
and nebula-storaged
services. To reduce effort, we didn't collect the perf data for nebula-metad
service.
Start the benchmark test for FindShortestPath scenario.
cd NebulaGraph-Bench \npython3 run.py stress run -s benchmark -scenario find_path.FindShortestPath -a localhost:9669 --args='-u 100 -i 100000'\n
After the benchmark finished, end the perf record by Ctrl + c.
Repeat above steps to collect corresponding profile data for the rest Go1Step, Go2Step, Go3Step and InsertPersonScenario scenarios.
create_gcov --binary=$NEBULA_HOME/bin/nebula-storaged \\\n--profile=~/FindShortestPath.data \\\n--gcov=~/FindShortestPath-storaged.gcov \\\n-gcov_version=1\n\ncreate_gcov --binary=$NEBULA_HOME/bin/nebula-graphd \\\n--profile=~/FindShortestPath.data \\\n--gcov=~/FindShortestPath-graphd.gcov \\\n-gcov_version=1\n
Repeat for Go1Step, Go2Step, Go3Step and InsertPersonScenario scenarios.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#merge_the_profile_data","title":"Merge the Profile Data","text":"profile_merger ~/FindShortestPath-graphd.gcov \\\n~/FindShortestPath-storaged.gcov \\\n~/go1step-storaged.gcov \\\n~/go1step-graphd.gcov \\\n~/go2step-storaged.gcov \\\n~/go2step-graphd.gcov \\\n~/go3step-storaged.gcov \\\n~/go3step-master-graphd.gcov \\\n~/InsertPersonScenario-storaged.gcov \\\n~/InsertPersonScenario-graphd.gcov\n
You will get a merged profile which is named fbdata.afdo
after that.
Recompile the GraphNebula Binary by passing the profile with compile option -fauto-profile
.
diff --git a/cmake/nebula/GeneralCompilerConfig.cmake b/cmake/nebula/GeneralCompilerConfig.cmake\n@@ -20,6 +20,8 @@ add_compile_options(-Wshadow)\n add_compile_options(-Wnon-virtual-dtor)\n add_compile_options(-Woverloaded-virtual)\n add_compile_options(-Wignored-qualifiers)\n+add_compile_options(-fauto-profile=~/fbdata.afdo)\n
Note
When you use multiple fbdata.afdo to compile multiple times, please remember to make clean
before re-compile, baucase only change the fbdata.afdo will not trigger re-compile.
You can use the SUBMIT JOB BALANCE
statement to balance the distribution of partitions and Raft leaders, or clear some Storage servers for easy maintenance. For details, see SUBMIT JOB BALANCE.
Danger
The BALANCE
commands migrate data and balance the distribution of partitions by creating and executing a set of subtasks. DO NOT stop any machine in the cluster or change its IP address until all the subtasks finish. Otherwise, the follow-up subtasks fail.
To balance the raft leaders, run SUBMIT JOB BALANCE LEADER
. It will start a job to balance the distribution of all the storage leaders in all graph spaces.
nebula> SUBMIT JOB BALANCE LEADER;\n
Run SHOW HOSTS
to check the balance result.
nebula> SHOW HOSTS;\n+------------------+------+----------+--------------+-----------------------------------+------------------------+----------------------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+------------------+------+----------+--------------+-----------------------------------+------------------------+----------------------+\n| \"192.168.10.101\" | 9779 | \"ONLINE\" | 8 | \"basketballplayer:3\" | \"basketballplayer:8\" | \"master\" |\n| \"192.168.10.102\" | 9779 | \"ONLINE\" | 3 | \"basketballplayer:3\" | \"basketballplayer:8\" | \"master\" |\n| \"192.168.10.103\" | 9779 | \"ONLINE\" | 0 | \"basketballplayer:2\" | \"basketballplayer:7\" | \"master\" |\n| \"192.168.10.104\" | 9779 | \"ONLINE\" | 0 | \"basketballplayer:2\" | \"basketballplayer:7\" | \"master\" |\n| \"192.168.10.105\" | 9779 | \"ONLINE\" | 0 | \"basketballplayer:2\" | \"basketballplayer:7\" | \"master\" |\n+------------------+------+----------+--------------+-----------------------------------+------------------------+----------------------+\n
Caution
During leader partition replica switching in NebulaGraph, the leader replicas will be temporarily prohibited from being written to until the switch is completed. If there are a large number of write requests during the switching period, it will result in a request error (Storage Error E_RPC_FAILURE
). See FAQ for error handling methods.
You can set the value of raft_heartbeat_interval_secs
in the Storage configuration file to control the timeout period for leader replica switching. For more information on the configuration file, see Storage configuration file.
NebulaGraph is used in a variety of industries. This topic presents a few best practices for using NebulaGraph. For more best practices, see Blog.
"},{"location":"8.service-tuning/practice/#scenarios","title":"Scenarios","text":"In graph theory, a super vertex, also known as a dense vertex, is a vertex with an extremely high number of adjacent edges. The edges can be outgoing or incoming.
Super vertices are very common because of the power-law distribution. For example, popular leaders in social networks (Internet celebrities), top stocks in the stock market, Big Four in the banking system, hubs in transportation networks, websites with high clicking rates on the Internet, and best sellers in E-commerce.
In NebulaGraph master, a vertex
and its properties
form a key-value pair
, with its VID
and other meta information as the key
. Its Out-Edge Key-Value
and In-Edge Key-Value
are stored in the same partition in the form of LSM-trees in hard disks and caches.
Therefore, directed traversals from this vertex and directed traversals ending at this vertex both involve either a large number of sequential IO scans (ideally, after Compaction) or a large number of random IO operations (frequent writes to the vertex and its incoming and outgoing edges).
As a rule of thumb, a vertex is considered dense when the number of its edges exceeds 10,000. Some special cases require additional consideration.
Note
In NebulaGraph master, there is no data structure that stores the out/in degree of each vertex. Therefore, there is no direct way to know whether a vertex is a super vertex. You can use Spark to count the degrees periodically, as in the sketch below.
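The following Scala sketch shows what such a periodic degree count could look like with NebulaGraph Spark Connector. The graph space test, the edge type follow, the Meta address, and the _srcId column name are assumptions; adapt them to your schema.
val config = NebulaConnectionConfig.builder().withMetaAddress("127.0.0.1:9559").build()\nval readEdgeConfig = ReadNebulaConfig.builder().withSpace("test").withLabel("follow").withNoColumn(true).build()\nval edges = spark.read.nebula(config, readEdgeConfig).loadEdgesToDF()\n// Count the out-degree per source vertex; vertices above ~10,000 edges are super-vertex candidates.\nedges.groupBy("_srcId").count().filter("count > 10000").show()\n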
"},{"location":"8.service-tuning/super-node/#indexes_for_duplicate_properties","title":"Indexes for duplicate properties","text":"In a property graph, there is another class of cases similar to super vertices: a property has a very high duplication rate, i.e., many vertices with the same tag
but different VIDs
have the same property with identical property values.
Property indexes in NebulaGraph master are designed to reuse the functionality of RocksDB in the Storage Service, in which case indexes are modeled as keys with the same prefix
. If the lookup of a property fails to hit the cache, it is processed as a random seek and a sequential prefix scan on the hard disk to find the corresponding VID. After that, the graph is usually traversed from this vertex, so that another random read and sequential scan for the corresponding key-value of this vertex will be triggered. The higher the duplication rate, the larger the scan range.
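For instance, a lookup like the following sketch is served by exactly such a seek plus prefix scan; the player tag, its name property, and an existing index on that property are assumptions for illustration:
nebula> LOOKUP ON player WHERE player.name == "Tim Duncan" YIELD id(vertex);\n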
For more information about property indexes, see How indexing works in NebulaGraph.
Usually, special design and processing are required when the number of duplicate property values exceeds 10,000.
"},{"location":"8.service-tuning/super-node/#suggested_solutions","title":"Suggested solutions","text":""},{"location":"8.service-tuning/super-node/#solutions_at_the_database_end","title":"Solutions at the database end","text":"Break up some of the super vertices according to their business significance:
Delete multiple edges and merge them into one.
For example, in the transfer scenario (Account_A)-[TRANSFER]->(Account_B)
, each transfer record is modeled as an edge between account A and account B, then there may be tens of thousands of transfer records between (Account_A)
and (Account_B)
.
In such scenarios, merge obsolete transfer details on a daily, weekly, or monthly basis. That is, batch-delete old edges and replace them with a small number of edges representing monthly total
and times
, and keep only the transfer details of the latest month, as sketched below.
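A sketch of the merge in nGQL, assuming an aggregated edge type TRANSFER_MONTHLY(total, times) has already been created and using placeholder figures and rank:
nebula> INSERT EDGE TRANSFER_MONTHLY(total, times) VALUES "Account_A"->"Account_B":(120000, 35);\nnebula> DELETE EDGE TRANSFER "Account_A"->"Account_B"@0;\n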
Split an edge into multiple edges of different types.
For example, in the (Airport)<-[DEPART]-(Flight)
scenario, the departure of each flight is modeled as an edge between a flight and an airport. Departures from a big airport might be enormous.
According to different airlines, divide the DEPART
edge type into finer edge types, such as DEPART_CEAIR
, DEPART_CSAIR
, etc. Specify the departing airline in queries (graph traversal), as in the sketch below.
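For example, a traversal restricted to one airline could look like this sketch (the airport VID is a placeholder):
nebula> GO FROM "airport_SZX" OVER DEPART_CEAIR REVERSELY YIELD src(edge) AS flight;\n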
Split vertices.
For example, in the loan network (person)-[BORROW]->(bank)
, large bank A will have a very large number of loans and borrowers.
In such scenarios, you can split the large vertex A into connected sub-vertices A1, A2, and A3.
(Person1)-[BORROW]->(BankA1), (Person2)-[BORROW]->(BankA2), (Person2)-[BORROW]->(BankA3);\n(BankA1)-[BELONGS_TO]->(BankA), (BankA2)-[BELONGS_TO]->(BankA), (BankA3)-[BELONGS_TO]->(BankA).\n
A1, A2, and A3 can either be three real branches of bank A, such as Beijing branch, Shanghai branch, and Zhejiang branch, or three virtual branches set up according to certain rules, such as A1: 1-1000, A2: 1001-10000 and A3: 10000+
according to the number of loans. In this way, any operation on A is converted into three separate operations on A1, A2, and A3.
NebulaGraph supports using snapshots to back up and restore data. When data loss or misoperation occurs, the data will be restored through the snapshot.
"},{"location":"backup-and-restore/3.manage-snapshot/#prerequisites","title":"Prerequisites","text":"NebulaGraph authentication is disabled by default. In this case, all users can use the snapshot feature.
If authentication is enabled, only the GOD role user can use the snapshot feature. For more information about roles, see Roles and privileges.
"},{"location":"backup-and-restore/3.manage-snapshot/#precautions","title":"Precautions","text":"ADD HOST
, DROP HOST
, CREATE SPACE
, DROP SPACE
, and BALANCE
are performed.DROP SNAPSHOT
.Run CREATE SNAPSHOT
to create a snapshot for all the graph spaces based on the current time for NebulaGraph. Creating a snapshot for a specific graph space is not supported yet.
Note
If the creation fails, refer to the later section to delete the corrupted snapshot and then recreate the snapshot.
nebula> CREATE SNAPSHOT;\n
"},{"location":"backup-and-restore/3.manage-snapshot/#view_snapshots","title":"View snapshots","text":"To view all existing snapshots, run SHOW SNAPSHOTS
.
nebula> SHOW SNAPSHOTS;\n+--------------------------------+---------+------------------+\n| Name | Status | Hosts |\n+--------------------------------+---------+------------------+\n| \"SNAPSHOT_2021_03_09_08_43_12\" | \"VALID\" | \"127.0.0.1:9779\" |\n| \"SNAPSHOT_2021_03_09_09_10_52\" | \"VALID\" | \"127.0.0.1:9779\" |\n+--------------------------------+---------+------------------+\n
The parameters in the return information are described as follows.
Parameter DescriptionName
The name of the snapshot directory. The prefix SNAPSHOT
indicates that the file is a snapshot file, and the suffix indicates the time the snapshot was created (UTC). Status
The status of the snapshot. VALID
indicates that the creation succeeded, while INVALID
indicates that it failed. Hosts
The IPs (or hostnames) and ports of all Storage servers at the time the snapshot was created."},{"location":"backup-and-restore/3.manage-snapshot/#snapshot_path","title":"Snapshot path","text":"Snapshots are stored in the path specified by the data_path
parameter in the Meta and Storage configuration files. When a snapshot is created, the checkpoints
directory is checked in the datastore path of the leader Meta service and all Storage services for the existence, and if it is not there, it is automatically created. The newly created snapshot is stored as a subdirectory within the checkpoints
directory. For example, SNAPSHOT_2021_03_09_08_43_12
. The suffix 2021_03_09_08_43_12
is generated automatically based on the creation time (UTC).
To quickly locate the path where the snapshots are stored, you can use the Linux command find
in the datastore path. For example:
$ cd /usr/local/nebula-graph-ent-master/data\n$ find |grep 'SNAPSHOT_2021_03_09_08_43_12'\n./data/meta2/nebula/0/checkpoints/SNAPSHOT_2021_03_09_08_43_12\n./data/meta2/nebula/0/checkpoints/SNAPSHOT_2021_03_09_08_43_12/data\n./data/meta2/nebula/0/checkpoints/SNAPSHOT_2021_03_09_08_43_12/data/000081.sst\n...\n
"},{"location":"backup-and-restore/3.manage-snapshot/#delete_snapshots","title":"Delete snapshots","text":"To delete a snapshot with the given name, run DROP SNAPSHOT
.
DROP SNAPSHOT <snapshot_name>;\n
Example:
nebula> DROP SNAPSHOT SNAPSHOT_2021_03_09_08_43_12;\nnebula> SHOW SNAPSHOTS;\n+--------------------------------+---------+------------------+\n| Name | Status | Hosts |\n+--------------------------------+---------+------------------+\n| \"SNAPSHOT_2021_03_09_09_10_52\" | \"VALID\" | \"127.0.0.1:9779\" |\n+--------------------------------+---------+------------------+\n
Note
Deleting the only snapshot within the checkpoints
directory also deletes the checkpoints
directory.
Warning
When you restore data with snapshots, make sure that the graph spaces backed up in the snapshot have not been dropped. Otherwise, the data of the graph spaces cannot be restored.
Currently, there is no command to restore data with snapshots. You need to manually copy the snapshot files to the corresponding folders, or automate the copy with a shell script. The logic is as follows:
After the snapshot is created, the checkpoints
directory is generated in the installation directory of the leader Meta service and all Storage services, and saves the created snapshot. Taking this topic as an example, when there are two graph spaces, the snapshots created are saved in /usr/local/nebula/data/meta/nebula/0/checkpoints
, /usr/local/nebula/data/storage/nebula/3/checkpoints
and /usr/local/nebula/data/storage/nebula/4/checkpoints
.
$ ls /usr/local/nebula/data/meta/nebula/0/checkpoints/\nSNAPSHOT_2021_03_09_09_10_52\n$ ls /usr/local/nebula/data/storage/nebula/3/checkpoints/\nSNAPSHOT_2021_03_09_09_10_52\n$ ls /usr/local/nebula/data/storage/nebula/4/checkpoints/\nSNAPSHOT_2021_03_09_09_10_52\n
To restore the lost data through snapshots, you can take a snapshot at an appropriate time, copy the folders data
and wal
in the corresponding snapshot directory to its parent directory (at the same level with checkpoints
) to overwrite the previous data
and wal
, and then restart the cluster.
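A shell sketch of this logic for a single Meta service follows; the snapshot name and paths are assumptions, and the same copy must be repeated for every Meta and Storage service before restarting the cluster:
$ SNAPSHOT=SNAPSHOT_2021_03_09_09_10_52\n$ cd /usr/local/nebula/data/meta/nebula/0\n$ cp -fr checkpoints/${SNAPSHOT}/data checkpoints/${SNAPSHOT}/wal .\n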
Warning
The data and wal directories of all Meta services should be overwritten at the same time. Otherwise, the new leader Meta service will use the latest Meta data after a cluster is restarted.
Backup & Restore (BR for short) is a Command-Line Interface (CLI) tool to back up data of graph spaces of NebulaGraph and to restore data from the backup files.
"},{"location":"backup-and-restore/nebula-br/1.what-is-br/#features","title":"Features","text":"The BR has the following features. It supports:
To use the BR, follow these steps:
This topic introduces the installation of BR in bare-metal deployment scenarios.
"},{"location":"backup-and-restore/nebula-br/2.compile-br/#notes","title":"Notes","text":"To use the BR (Community Edition) tool, you need to install the NebulaGraph Agent service, which is taken as a daemon for each machine in the cluster that starts and stops the NebulaGraph service, and uploads and downloads backup files. The BR (Community Edition) tool and the Agent plug-in are installed as described below.
"},{"location":"backup-and-restore/nebula-br/2.compile-br/#version_compatibility","title":"Version compatibility","text":"NebulaGraph BR Agent 3.5.x ~ 3.6.0 3.6.0 3.6.x ~ 3.7.0 3.3.0 ~ 3.4.x 3.3.0 0.2.0 ~ 3.4.0 3.0.x ~ 3.2.x 0.6.1 0.1.0 ~ 0.2.0"},{"location":"backup-and-restore/nebula-br/2.compile-br/#install_br_with_a_binary_file","title":"Install BR with a binary file","text":"Install BR.
wget https://github.com/vesoft-inc/nebula-br/releases/download/v3.6.0/br-3.6.0-linux-amd64\n
Change the binary file name to br
.
sudo mv br-3.6.0-linux-amd64 br\n
Grand execute permission to BR.
sudo chmod +x br\n
Run ./br version
to check BR version.
[nebula-br]$ ./br version\nNebula Backup And Restore Utility Tool,V-3.6.0\n
Before compiling the BR, do a check of these:
To compile the BR, follow these steps:
Clone the nebula-br
repository to your machine.
git clone https://github.com/vesoft-inc/nebula-br.git\n
Change to the br
directory.
cd nebula-br\n
Compile the BR.
make\n
Users can enter bin/br version
on the command line. If the following results are returned, the BR is compiled successfully.
[nebula-br]$ bin/br version\nNebulaGraph Backup And Restore Utility Tool,V-3.6.0\n
"},{"location":"backup-and-restore/nebula-br/2.compile-br/#install_agent","title":"Install Agent","text":"NebulaGraph Agent is installed as a binary file in each machine and serves the BR tool with the RPC protocol.
In each machine, follow these steps:
Install Agent.
wget https://github.com/vesoft-inc/nebula-agent/releases/download/v3.7.0/agent-3.7.0-linux-amd64\n
Rename the Agent file to agent
.
sudo mv agent-3.7.0-linux-amd64 agent\n
Add execute permission to Agent.
sudo chmod +x agent\n
Start Agent.
Note
Before starting Agent, make sure that the Meta service has been started and Agent has read and write access to the corresponding NebulaGraph cluster directory and backup directory.
sudo nohup ./agent --agent=\"<agent_node_ip>:8888\" --meta=\"<metad_node_ip>:9559\" --ratelimit=<file_size_bt> > nebula_agent.log 2>&1 &\n
--agent
: The IP address and port number of Agent.--meta
: The IP address and access port of any Meta service in the cluster.--ratelimit
: (Optional) Limits the speed of file uploads and downloads to prevent bandwidth from being filled up and making other services unavailable. Unit: Bytes.For example:
sudo nohup ./agent --agent=\"192.168.8.129:8888\" --meta=\"192.168.8.129:9559\" --ratelimit=1048576 > nebula_agent.log 2>&1 &\n
Caution
The IP address format for --agent
should be the same as that of Meta and Storage services set in the configuration files. That is, use the real IP addresses or use 127.0.0.1
. Otherwise Agent does not run.
Log into NebulaGraph and then run the following command to view the status of Agent.
nebula> SHOW HOSTS AGENT;\n+-----------------+------+----------+---------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-----------------+------+----------+---------+--------------+---------+\n| \"192.168.8.129\" | 8888 | \"ONLINE\" | \"AGENT\" | \"96646b8\" | |\n+-----------------+------+----------+---------+--------------+---------+ \n
If you encounter E_LIST_CLUSTER_NO_AGENT_FAILURE
error, it may be due to the Agent service is not started or the Agent service is not registered to Meta service. First, execute SHOW HOSTS AGENT
to check the status of the Agent service on all nodes in the cluster, when the status shows OFFLINE
, it means the registration of Agent failed, then check whether the value of the --meta
option in the command to start the Agent service is correct.
After the BR is installed, you can back up data of the entire graph space. This topic introduces how to use the BR to back up data.
"},{"location":"backup-and-restore/nebula-br/3.br-backup-data/#prerequisites","title":"Prerequisites","text":"To back up data with the BR, do a check of these:
If you store the backup files locally, create a directory with the same absolute path on the meta servers, the storage servers, and the BR machine for the backup files and get the absolute path. Make sure the account has write privileges for this directory.
Warning
In the production environment, we recommend that you mount Network File System (NFS) storage to the meta servers, the storage servers, and the BR machine for local backup, or use Amazon S3 or Alibaba Cloud OSS for remote backup. When you restore the data from local files, you must manually move these backup files to a specified directory, which causes redundant data and troubles. For more information, see Restore data from backup files.
In the BR installation directory (the default path of the compiled BR is ./bin/br
), run the following command to perform a full backup for the entire cluster.
Note
Make sure that the local path where the backup file is stored exists.
$ ./br backup full --meta <ip_address> --storage <storage_path>\n
For example:
Run the following command to perform a full backup for the entire cluster whose meta service address is 192.168.8.129:9559
, and save the backup file to /home/nebula/backup/
.
Caution
If there are multiple metad addresses, you can use any one of them.
Caution
If you back up data to a local disk, only the data of the leader metad is backed up by default. So if there are multiple metad processes, you need to manually copy the directory of the leader metad (path <storage_path>/meta
) and overwrite the corresponding directory of other follower meatd processes.
$ ./br backup full --meta \"192.168.8.129:9559\" --storage \"local:///home/nebula/backup/\"\n
Run the following command to perform a full backup for the entire cluster whose meta service address is 192.168.8.129:9559
, and save the backup file to backup
in the br-test
bucket of the object storage service compatible with S3 protocol.
$ ./br backup full --meta \"192.168.8.129:9559\" --s3.endpoint \"http://192.168.8.129:9000\" --storage=\"s3://br-test/backup/\" --s3.access_key=minioadmin --s3.secret_key=minioadmin --s3.region=default\n
The parameters are as follows.
Parameter Data type Required Default value Description-h,-help
- No None Checks help for restoration. --debug
- No None Checks for more log information. --log
string No \"br.log\"
Specifies detailed log path for restoration and backup. --meta
string Yes None The IP address and port of the meta service. --space
string No None (Experimental feature) Specifies the names of the spaces to be backed up. All spaces will be backed up if not specified. Multiple spaces can be specified, and the format is --spaces nba_01 --spaces nba_02
. --storage
string Yes None The target storage URL of BR backup data. The format is: \\<Schema>://\\<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
.PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID."},{"location":"backup-and-restore/nebula-br/3.br-backup-data/#next_to_do","title":"Next to do","text":"After the backup files are generated, you can use the BR to restore them for NebulaGraph. For more information, see Use BR to restore data.
"},{"location":"backup-and-restore/nebula-br/4.br-restore-data/","title":"Use BR to restore data","text":"If you use the BR to back up data, you can use it to restore the data to NebulaGraph. This topic introduces how to use the BR to restore data from backup files.
Caution
During the restoration process, the data on the target NebulaGraph cluster is removed and then is replaced with the data from the backup files. If necessary, back up the data on the target cluster.
Caution
The restoration process is performed OFFLINE.
"},{"location":"backup-and-restore/nebula-br/4.br-restore-data/#prerequisites","title":"Prerequisites","text":"In the BR installation directory (the default path of the compiled BR is ./br
), run the following command to perform a full backup for the entire cluster.
Users can use the following command to list the existing backup information:
$ ./br show --storage <storage_path>\n
For example, run the following command to list the backup information in the local /home/nebula/backup
path. $ ./br show --storage \"local:///home/nebula/backup\"\n+----------------------------+---------------------+------------------------+-------------+------------+\n| NAME | CREATE TIME | SPACES | FULL BACKUP | ALL SPACES |\n+----------------------------+---------------------+------------------------+-------------+------------+\n| BACKUP_2022_02_10_07_40_41 | 2022-02-10 07:40:41 | basketballplayer | true | true |\n| BACKUP_2022_02_11_08_26_43 | 2022-02-11 08:26:47 | basketballplayer,foesa | true | true |\n+----------------------------+---------------------+------------------------+-------------+------------+\n
Or, you can run the following command to list the backup information stored in S3 URL s3://192.168.8.129:9000/br-test/backup
.
$ ./br show --s3.endpoint \"http://192.168.8.129:9000\" --storage=\"s3://br-test/backup/\" --s3.access_key=minioadmin --s3.secret_key=minioadmin --s3.region=default\n
Parameter Data type Required Default value Description -h,-help
- No None Checks help for restoration. -debug
- No None Checks for more log information. -log
string No \"br.log\"
Specifies detailed log path for restoration and backup. --storage
string Yes None The target storage URL of BR backup data. The format is: <Schema>://<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
.PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID. Run the following command to restore data.
$ ./br restore full --meta <ip_address> --storage <storage_path> --name <backup_name>\n
For example, run the following command to upload the backup files from the local /home/nebula/backup/
to the cluster where the meta service's address is 192.168.8.129:9559
.
$ ./br restore full --meta \"192.168.8.129:9559\" --storage \"local:///home/nebula/backup/\" --name BACKUP_2021_12_08_18_38_08\n
Or, you can run the following command to upload the backup files from the S3 URL s3://192.168.8.129:9000/br-test/backup
.
$ ./br restore full --meta \"192.168.8.129:9559\" --s3.endpoint \"http://192.168.8.129:9000\" --storage=\"s3://br-test/backup/\" --s3.access_key=minioadmin --s3.secret_key=minioadmin --s3.region=\"default\" --name BACKUP_2021_12_08_18_38_08\n
If the following information is returned, the data is restored successfully.
Restore succeed.\n
Caution
If your new cluster hosts' IPs are not all the same as the backup cluster, after restoration, you should run add hosts
to add the Storage host IPs in the new cluster one by one.
The parameters are as follows.
Parameter Data type Required Default value Description-h,-help
- No None Checks help for restoration. -debug
- No None Checks for more log information. -log
string No \"br.log\"
Specifies detailed log path for restoration and backup. -meta
string Yes None The IP address and port of the meta service. -name
string Yes None The name of backup. --storage
string Yes None The target storage URL of BR backup data. The format is: \\<Schema>://\\<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
.PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID. Run the following command to clean up temporary files if any error occurred during backup. It will clean the files in cluster and external storage. You could also use it to clean up old backups files in external storage.
$ ./br cleanup --meta <ip_address> --storage <storage_path> --name <backup_name>\n
The parameters are as follows.
Parameter Data type Required Default value Description-h,-help
- No None Checks help for restoration. -debug
- No None Checks for more log information. -log
string No \"br.log\"
Specifies detailed log path for restoration and backup. -meta
string Yes None The IP address and port of the meta service. -name
string Yes None The name of backup. --storage
string Yes None The target storage URL of BR backup data. The format is: \\<Schema>://\\<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
.PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID. NebulaGraph Flink Connector is a connector that helps Flink users quickly access NebulaGraph. NebulaGraph Flink Connector supports reading data from the NebulaGraph database or writing other external data to the NebulaGraph database.
For more information, see NebulaGraph Flink Connector.
"},{"location":"connector/nebula-flink-connector/#use_cases","title":"Use cases","text":"NebulaGraph Flink Connector applies to the following scenarios:
Release
"},{"location":"connector/nebula-flink-connector/#version_compatibility","title":"Version compatibility","text":"The correspondence between the NebulaGraph Flink Connector version and the NebulaGraph core version is as follows.
Flink Connector version NebulaGraph version 3.0-SNAPSHOT nightly 3.5.0 3.x.x 3.3.0 3.x.x 3.0.0 3.x.x 2.6.1 2.6.0, 2.6.1 2.6.0 2.6.0, 2.6.1 2.5.0 2.5.0, 2.5.1 2.0.0 2.0.0, 2.0.1"},{"location":"connector/nebula-flink-connector/#prerequisites","title":"Prerequisites","text":"Add the following dependency to the Maven configuration file pom.xml
to automatically obtain the Flink Connector.
<dependency>\n <groupId>com.vesoft</groupId>\n <artifactId>nebula-flink-connector</artifactId>\n <version>3.5.0</version>\n</dependency>\n
"},{"location":"connector/nebula-flink-connector/#compile_and_package","title":"Compile and package","text":"Follow the steps below to compile and package the Flink Connector.
Clone repository nebula-flink-connector
.
$ git clone -b release-3.5 https://github.com/vesoft-inc/nebula-flink-connector.git\n
Enter the nebula-flink-connector
directory.
Compile and package.
$ mvn clean package -Dmaven.test.skip=true\n
After compilation, a file similar to nebula-flink-connector-3.5.0.jar
is generated in the directory connector/target
of the folder.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();\nNebulaClientOptions nebulaClientOptions = new NebulaClientOptions.NebulaClientOptionsBuilder()\n .setGraphAddress(\"127.0.0.1:9669\")\n .setMetaAddress(\"127.0.0.1:9559\")\n .build();\nNebulaGraphConnectionProvider graphConnectionProvider = new NebulaGraphConnectionProvider(nebulaClientOptions);\nNebulaMetaConnectionProvider metaConnectionProvider = new NebulaMetaConnectionProvider(nebulaClientOptions);\n\nVertexExecutionOptions executionOptions = new VertexExecutionOptions.ExecutionOptionBuilder()\n .setGraphSpace(\"flinkSink\")\n .setTag(\"player\")\n .setIdIndex(0)\n .setFields(Arrays.asList(\"name\", \"age\"))\n .setPositions(Arrays.asList(1, 2))\n .setBatchSize(2)\n .build();\n\nNebulaVertexBatchOutputFormat outputFormat = new NebulaVertexBatchOutputFormat(\n graphConnectionProvider, metaConnectionProvider, executionOptions);\nNebulaSinkFunction<Row> nebulaSinkFunction = new NebulaSinkFunction<>(outputFormat);\nDataStream<Row> dataStream = playerSource.map(row -> {\n Row record = new org.apache.flink.types.Row(row.size());\n for (int i = 0; i < row.size(); i++) {\n record.setField(i, row.get(i));\n }\n return record;\n });\ndataStream.addSink(nebulaSinkFunction);\nenv.execute(\"write nebula\")\n
"},{"location":"connector/nebula-flink-connector/#read_data_from_nebulagraph","title":"Read data from NebulaGraph","text":"NebulaClientOptions nebulaClientOptions = new NebulaClientOptions.NebulaClientOptionsBuilder()\n .setMetaAddress(\"127.0.0.1:9559\")\n .build();\nstorageConnectionProvider = new NebulaStorageConnectionProvider(nebulaClientOptions);\nStreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();\nenv.setParallelism(1);\n\nVertexExecutionOptions vertexExecutionOptions = new VertexExecutionOptions.ExecutionOptionBuilder()\n .setGraphSpace(\"flinkSource\")\n .setTag(\"person\")\n .setNoColumn(false)\n .setFields(Arrays.asList())\n .setLimit(100)\n .build();\n\nNebulaSourceFunction sourceFunction = new NebulaSourceFunction(storageConnectionProvider)\n .setExecutionOptions(vertexExecutionOptions);\nDataStreamSource<BaseTableRow> dataStreamSource = env.addSource(sourceFunction);\ndataStreamSource.map(row -> {\n List<ValueWrapper> values = row.getValues();\n Row record = new Row(15);\n record.setField(0, values.get(0).asLong());\n record.setField(1, values.get(1).asString());\n record.setField(2, values.get(2).asString());\n record.setField(3, values.get(3).asLong());\n record.setField(4, values.get(4).asLong());\n record.setField(5, values.get(5).asLong());\n record.setField(6, values.get(6).asLong());\n record.setField(7, values.get(7).asDate());\n record.setField(8, values.get(8).asDateTime().getUTCDateTimeStr());\n record.setField(9, values.get(9).asLong());\n record.setField(10, values.get(10).asBoolean());\n record.setField(11, values.get(11).asDouble());\n record.setField(12, values.get(12).asDouble());\n record.setField(13, values.get(13).asTime().getUTCTimeStr());\n record.setField(14, values.get(14).asGeography());\n return record;\n}).print();\nenv.execute(\"NebulaStreamSource\");\n
"},{"location":"connector/nebula-flink-connector/#parameter_descriptions","title":"Parameter descriptions","text":"NebulaClientOptions
is the configuration for connecting to NebulaGraph, as described below.
setGraphAddress
String Yes The Graph service address of NebulaGraph. setMetaAddress
String Yes The Meta service address of NebulaGraph. VertexExecutionOptions
is the configuration for reading vertices from and writing vertices to NebulaGraph, as described below.
setGraphSpace
String Yes The graph space name. setTag
String Yes The tag name. setIdIndex
Int Yes The subscript of the stream data field that is used as the VID when writing data to NebulaGraph. setFields
List Yes A collection of the property names of a tag. It is used to write data to or read data from NebulaGraph. Make sure the setNoColumn
is false
when reading data; otherwise, the configuration is invalid. If this parameter is empty, all properties are read when reading data from NebulaGraph. setPositions
List Yes A collection of the subscripts of the stream data fields. It indicates that the corresponding field values are written to NebulaGraph as property values. This parameter needs to correspond to setFields
. setBatchSize
String No The maximum number of data records to write to NebulaGraph at a time. The default value is 2000
. setNoColumn
String No The properties are not to be read if set to true
when reading data. The default value is false
. setLimit
String No The maximum number of data records to pull at a time when reading data. The default value is 2000
. EdgeExecutionOptions
is the configuration for reading edges from and writing edges to NebulaGraph, as described below.
setGraphSpace
String Yes The graph space name. setEdge
String Yes The edge type name. setSrcIndex
Int Yes The subscript of the stream data field that is used as the VID of the source vertex when writing data to NebulaGraph. setDstIndex
Int Yes The subscript of the stream data field that is used as the VID of the destination vertex when writing data to NebulaGraph. setRankIndex
Int Yes The subscript of the stream data field that is used as the rank of the edge when writing data to NebulaGraph. setFields
List Yes A collection of the property names of an edge type. It is used to write data to or read data from NebulaGraph. Make sure the setNoColumn
is false
when reading data; otherwise, the configuration is invalid. If this parameter is empty, all properties are read when reading data from NebulaGraph. setPositions
List Yes A collection of the subscripts of the stream data fields. It indicates that the corresponding field values are written to NebulaGraph as property values. This parameter needs to correspond to setFields
. setBatchSize
String No The maximum number of data records to write to NebulaGraph at a time. The default value is 2000
. setNoColumn
String No The properties are not to be read if set to true
when reading data. The default value is false
. setLimit
String No The maximum number of data records to pull at a time when reading data. The default value is 2000
. Create a graph space.
NebulaCatalog nebulaCatalog = NebulaCatalogUtils.createNebulaCatalog(\n \"NebulaCatalog\",\n \"default\",\n \"root\",\n \"nebula\",\n \"127.0.0.1:9559\",\n \"127.0.0.1:9669\");\n\nEnvironmentSettings settings = EnvironmentSettings.newInstance()\n .inStreamingMode()\n .build();\nTableEnvironment tableEnv = TableEnvironment.create(settings);\n\ntableEnv.registerCatalog(CATALOG_NAME, nebulaCatalog);\ntableEnv.useCatalog(CATALOG_NAME);\n\nString createDataBase = \"CREATE DATABASE IF NOT EXISTS `db1`\"\n + \" COMMENT 'space 1'\"\n + \" WITH (\"\n + \" 'partition_num' = '100',\"\n + \" 'replica_factor' = '3',\"\n + \" 'vid_type' = 'FIXED_STRING(10)'\"\n + \")\";\ntableEnv.executeSql(createDataBase);\n
Create a tag.
tableEnvironment.executeSql(\"CREATE TABLE `person` (\"\n + \" vid BIGINT,\"\n + \" col1 STRING,\"\n + \" col2 STRING,\"\n + \" col3 BIGINT,\"\n + \" col4 BIGINT,\"\n + \" col5 BIGINT,\"\n + \" col6 BIGINT,\"\n + \" col7 DATE,\"\n + \" col8 TIMESTAMP,\"\n + \" col9 BIGINT,\"\n + \" col10 BOOLEAN,\"\n + \" col11 DOUBLE,\"\n + \" col12 DOUBLE,\"\n + \" col13 TIME,\"\n + \" col14 STRING\"\n + \") WITH (\"\n + \" 'connector' = 'nebula',\"\n + \" 'meta-address' = '127.0.0.1:9559',\"\n + \" 'graph-address' = '127.0.0.1:9669',\"\n + \" 'username' = 'root',\"\n + \" 'password' = 'nebula',\"\n + \" 'data-type' = 'vertex',\"\n + \" 'graph-space' = 'flink_test',\"\n + \" 'label-name' = 'person'\"\n + \")\"\n);\n
Create an edge type.
tableEnvironment.executeSql(\"CREATE TABLE `friend` (\"\n + \" sid BIGINT,\"\n + \" did BIGINT,\"\n + \" rid BIGINT,\"\n + \" col1 STRING,\"\n + \" col2 STRING,\"\n + \" col3 BIGINT,\"\n + \" col4 BIGINT,\"\n + \" col5 BIGINT,\"\n + \" col6 BIGINT,\"\n + \" col7 DATE,\"\n + \" col8 TIMESTAMP,\"\n + \" col9 BIGINT,\"\n + \" col10 BOOLEAN,\"\n + \" col11 DOUBLE,\"\n + \" col12 DOUBLE,\"\n + \" col13 TIME,\"\n + \" col14 STRING\"\n + \") WITH (\"\n + \" 'connector' = 'nebula',\"\n + \" 'meta-address' = '127.0.0.1:9559',\"\n + \" 'graph-address' = '127.0.0.1:9669',\"\n + \" 'username' = 'root',\"\n + \" 'password' = 'nebula',\"\n + \" 'graph-space' = 'flink_test',\"\n + \" 'label-name' = 'friend',\"\n + \" 'data-type'='edge',\"\n + \" 'src-id-index'='0',\"\n + \" 'dst-id-index'='1',\"\n + \" 'rank-id-index'='2'\"\n + \")\"\n);\n
Queries the data of an edge type and inserts it into another edge type.
Table table = tableEnvironment.sqlQuery(\"SELECT * FROM `friend`\");\ntable.executeInsert(\"`friend_sink`\").await();\n
NebulaGraph Spark Connector is a Spark connector application for reading and writing NebulaGraph data in Spark standard format. NebulaGraph Spark Connector consists of two parts: Reader and Writer.
Reader
Provides a Spark SQL interface. This interface can be used to read NebulaGraph data. It reads one vertex or edge type data at a time and assemble the result into a Spark DataFrame.
Writer
Provides a Spark SQL interface. This interface can be used to write DataFrames into NebulaGraph in a row-by-row or batch-import way.
For more information, see NebulaGraph Spark Connector.
"},{"location":"connector/nebula-spark-connector/#version_compatibility","title":"Version compatibility","text":"The correspondence between the NebulaGraph Spark Connector version, the NebulaGraph core version and the Spark version is as follows.
Spark Connector version NebulaGraph version Spark version nebula-spark-connector_3.0-3.0-SNAPSHOT.jar nightly 3.x nebula-spark-connector_2.2-3.0-SNAPSHOT.jar nightly 2.2.x nebula-spark-connector-3.0-SNAPSHOT.jar nightly 2.4.x nebula-spark-connector_3.0-3.6.0.jar 3.x 3.x nebula-spark-connector_2.2-3.6.0.jar 3.x 2.2.x nebula-spark-connector-3.6.0.jar 3.x 2.4.x nebula-spark-connector_2.2-3.4.0.jar 3.x 2.2.x nebula-spark-connector-3.4.0.jar 3.x 2.4.x nebula-spark-connector_2.2-3.3.0.jar 3.x 2.2.x nebula-spark-connector-3.3.0.jar 3.x 2.4.x nebula-spark-connector-3.0.0.jar 3.x 2.4.x nebula-spark-connector-2.6.1.jar 2.6.0, 2.6.1 2.4.x nebula-spark-connector-2.6.0.jar 2.6.0, 2.6.1 2.4.x nebula-spark-connector-2.5.1.jar 2.5.0, 2.5.1 2.4.x nebula-spark-connector-2.5.0.jar 2.5.0, 2.5.1 2.4.x nebula-spark-connector-2.1.0.jar 2.0.0, 2.0.1 2.4.x nebula-spark-connector-2.0.1.jar 2.0.0, 2.0.1 2.4.x nebula-spark-connector-2.0.0.jar 2.0.0, 2.0.1 2.4.x"},{"location":"connector/nebula-spark-connector/#use_cases","title":"Use cases","text":"NebulaGraph Spark Connector applies to the following scenarios:
The features of NebulaGraph Spark Connector 3.6.0 are as follows:
insert
, update
and delete
, are supported. insert
mode will insert (overwrite) data, update
mode will only update existing data, and delete
mode will only delete data.Release
"},{"location":"connector/nebula-spark-connector/#get_nebulagraph_spark_connector","title":"Get NebulaGraph Spark Connector","text":""},{"location":"connector/nebula-spark-connector/#compile_and_package","title":"Compile and package","text":"Clone repository nebula-spark-connector
.
$ git clone -b release-3.6 https://github.com/vesoft-inc/nebula-spark-connector.git\n
Enter the nebula-spark-connector
directory.
Compile and package. The procedure varies with Spark versions.
Note
Spark of the corresponding version has been installed.
- Spark 2.4
```bash\n$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true -pl nebula-spark-connector -am -Pscala-2.11 -Pspark-2.4\n```\n
- Spark 2.2
```bash\n$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true -pl nebula-spark-connector_2.2 -am -Pscala-2.11 -Pspark-2.2\n```\n
- Spark 3.x
```bash\n$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true -pl nebula-spark-connector_3.0 -am -Pscala-2.12 -Pspark-3.0\n```\n
After compilation, a file similar to nebula-spark-connector-3.6.0-SHANPSHOT.jar
is generated in the directory target
of the folder.
Download
"},{"location":"connector/nebula-spark-connector/#how_to_use","title":"How to use","text":"When using NebulaGraph Spark Connector to reading and writing NebulaGraph data, You can refer to the following code.
# Read vertex and edge data from NebulaGraph.\nspark.read.nebula().loadVerticesToDF()\nspark.read.nebula().loadEdgesToDF()\n\n# Write dataframe data into NebulaGraph as vertex and edges.\ndataframe.write.nebula().writeVertices()\ndataframe.write.nebula().writeEdges()\n
nebula()
receives two configuration parameters, including connection configuration and read-write configuration.
Note
If the value of the properties contains Chinese characters, the encoding error may appear. Please add the following options when submitting the Spark task:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
"},{"location":"connector/nebula-spark-connector/#reading_data_from_nebulagraph","title":"Reading data from NebulaGraph","text":"val config = NebulaConnectionConfig\n .builder()\n .withMetaAddress(\"127.0.0.1:9559\")\n .withConenctionRetry(2)\n .withExecuteRetry(2)\n .withTimeout(6000)\n .build()\n\nval nebulaReadVertexConfig: ReadNebulaConfig = ReadNebulaConfig\n .builder()\n .withSpace(\"test\")\n .withLabel(\"person\")\n .withNoColumn(false)\n .withReturnCols(List(\"birthday\"))\n .withLimit(10)\n .withPartitionNum(10)\n .build()\nval vertex = spark.read.nebula(config, nebulaReadVertexConfig).loadVerticesToDF()\n\nval nebulaReadEdgeConfig: ReadNebulaConfig = ReadNebulaConfig\n .builder()\n .withSpace(\"test\")\n .withLabel(\"knows\")\n .withNoColumn(false)\n .withReturnCols(List(\"degree\"))\n .withLimit(10)\n .withPartitionNum(10)\n .build()\nval edge = spark.read.nebula(config, nebulaReadEdgeConfig).loadEdgesToDF()\n
NebulaConnectionConfig
is the configuration for connecting to NebulaGraph, as described below.
withMetaAddress
Yes Specifies the IP addresses and ports of all Meta Services. Separate multiple addresses with commas. The format is ip1:port1,ip2:port2,...
. Read data is no need to configure withGraphAddress
. withConnectionRetry
No The number of retries that the NebulaGraph Java Client connected to NebulaGraph. The default value is 1
. withExecuteRetry
No The number of retries that the NebulaGraph Java Client executed query statements. The default value is 1
. withTimeout
No The timeout for the NebulaGraph Java Client request response. The default value is 6000
, Unit: ms. ReadNebulaConfig
is the configuration to read NebulaGraph data, as described below.
withSpace
Yes NebulaGraph space name. withLabel
Yes The Tag or Edge type name within the NebulaGraph space. withNoColumn
No Whether the property is not read. The default value is false
, read property. If the value is true
, the property is not read, the withReturnCols
configuration is invalid. withReturnCols
No Configures the set of properties for vertex or edges to read. the format is List(property1,property2,...)
, The default value is List()
, indicating that all properties are read. withLimit
No Configure the number of rows of data read from the server by the NebulaGraph Java Storage Client at a time. The default value is 1000
. withPartitionNum
No Configures the number of Spark partitions to read the NebulaGraph data. The default value is 100
. This value should not exceed the number of slices in the graph space (partition_num). Note
The values of columns in a dataframe are automatically written to NebulaGraph as property values.
val config = NebulaConnectionConfig\n .builder()\n .withMetaAddress(\"127.0.0.1:9559\")\n .withGraphAddress(\"127.0.0.1:9669\")\n .withConenctionRetry(2)\n .build()\n\nval nebulaWriteVertexConfig: WriteNebulaVertexConfig = WriteNebulaVertexConfig \n .builder()\n .withSpace(\"test\")\n .withTag(\"person\")\n .withVidField(\"id\")\n .withVidPolicy(\"hash\")\n .withVidAsProp(true)\n .withUser(\"root\")\n .withPasswd(\"nebula\")\n .withBatch(1000)\n .build() \ndf.write.nebula(config, nebulaWriteVertexConfig).writeVertices()\n\nval nebulaWriteEdgeConfig: WriteNebulaEdgeConfig = WriteNebulaEdgeConfig \n .builder()\n .withSpace(\"test\")\n .withEdge(\"friend\")\n .withSrcIdField(\"src\")\n .withSrcPolicy(null)\n .withDstIdField(\"dst\")\n .withDstPolicy(null)\n .withRankField(\"degree\")\n .withSrcAsProperty(true)\n .withDstAsProperty(true)\n .withRankAsProperty(true)\n .withUser(\"root\")\n .withPasswd(\"nebula\")\n .withBatch(1000)\n .build()\ndf.write.nebula(config, nebulaWriteEdgeConfig).writeEdges()\n
The default write mode is insert
, which can be changed to update
or delete
via withWriteMode
configuration:
val config = NebulaConnectionConfig\n .builder()\n .withMetaAddress(\"127.0.0.1:9559\")\n .withGraphAddress(\"127.0.0.1:9669\")\n .build()\nval nebulaWriteVertexConfig = WriteNebulaVertexConfig\n .builder()\n .withSpace(\"test\")\n .withTag(\"person\")\n .withVidField(\"id\")\n .withVidAsProp(true)\n .withBatch(1000)\n .withWriteMode(WriteMode.UPDATE)\n .build()\ndf.write.nebula(config, nebulaWriteVertexConfig).writeVertices()\n
NebulaConnectionConfig
is the configuration for connecting to the nebula graph, as described below.
withMetaAddress
Yes Specifies the IP addresses and ports of all Meta Services. Separate multiple addresses with commas. The format is ip1:port1,ip2:port2,...
. withGraphAddress
Yes Specifies the IP addresses and ports of Graph Services. Separate multiple addresses with commas. The format is ip1:port1,ip2:port2,...
. withConnectionRetry
No Number of retries that the NebulaGraph Java Client connected to NebulaGraph. The default value is 1
. WriteNebulaVertexConfig
is the configuration of the write vertex, as described below.
withSpace
Yes NebulaGraph space name. withTag
Yes The Tag name that needs to be associated when a vertex is written. withVidField
Yes The column in the DataFrame as the vertex ID. withVidPolicy
No When writing the vertex ID, NebulaGraph use mapping function, supports HASH only. No mapping is performed by default. withVidAsProp
No Whether the column in the DataFrame that is the vertex ID is also written as an property. The default value is false
. If set to true
, make sure the Tag has the same property name as VidField
. withUser
No NebulaGraph username. If authentication is disabled, you do not need to configure the username and password. withPasswd
No The password for the NebulaGraph username. withBatch
Yes The number of rows of data written at a time. The default value is 1000
. withWriteMode
No Write mode. The optional values are insert
, update
and delete
. The default value is insert
. withDeleteEdge
No Whether to delete the related edges synchronously when deleting a vertex. The default value is false
. It takes effect when withWriteMode
is delete
. WriteNebulaEdgeConfig
is the configuration of the write edge, as described below.
withSpace
Yes NebulaGraph space name. withEdge
Yes The Edge type name that needs to be associated when a edge is written. withSrcIdField
Yes The column in the DataFrame as the vertex ID. withSrcPolicy
No When writing the starting vertex ID, NebulaGraph use mapping function, supports HASH only. No mapping is performed by default. withDstIdField
Yes The column in the DataFrame that serves as the destination vertex. withDstPolicy
No When writing the destination vertex ID, NebulaGraph use mapping function, supports HASH only. No mapping is performed by default. withRankField
No The column in the DataFrame as the rank. Rank is not written by default. withSrcAsProperty
No Whether the column in the DataFrame that is the starting vertex is also written as an property. The default value is false
. If set to true
, make sure Edge type has the same property name as SrcIdField
. withDstAsProperty
No Whether column that are destination vertex in the DataFrame are also written as property. The default value is false
. If set to true
, make sure Edge type has the same property name as DstIdField
. withRankAsProperty
No Whether column in the DataFrame that is the rank is also written as property.The default value is false
. If set to true
, make sure Edge type has the same property name as RankField
. withUser
No NebulaGraph username. If authentication is disabled, you do not need to configure the username and password. withPasswd
No The password for the NebulaGraph username. withBatch
Yes The number of rows of data written at a time. The default value is 1000
. withWriteMode
No Write mode. The optional values are insert
, update
and delete
. The default value is insert
. NebulaGraph Algorithm (Algorithm) is a Spark application based on GraphX. It uses a complete algorithm tool to perform graph computing on the data in the NebulaGraph database by submitting a Spark task. You can also programmatically use the algorithm under the lib repository to perform graph computing on DataFrame.
"},{"location":"graph-computing/nebula-algorithm/#version_compatibility","title":"Version compatibility","text":"The correspondence between the NebulaGraph Algorithm release and the NebulaGraph core release is as follows.
NebulaGraph NebulaGraph Algorithm nightly 3.0-SNAPSHOT 3.0.0 ~ 3.4.x 3.x.0 2.6.x 2.6.x 2.5.0\u30012.5.1 2.5.0 2.0.0\u30012.0.1 2.1.0"},{"location":"graph-computing/nebula-algorithm/#prerequisites","title":"Prerequisites","text":"Before using the NebulaGraph Algorithm, users need to confirm the following information:
Graph computing outputs vertex datasets, and the algorithm results are stored in DataFrames as the properties of vertices. You can do further operations such as statistics and filtering according to your business requirements.
!!!
Before Algorithm v3.1.0, when submitting the algorithm package directly, the data of the vertex ID must be an integer. That is, the vertex ID can be INT or String, but the data itself is an integer.\n
"},{"location":"graph-computing/nebula-algorithm/#supported_algorithms","title":"Supported algorithms","text":"The graph computing algorithms supported by NebulaGraph Algorithm are as follows.
Algorithm Description Scenario Properties name Properties type PageRank The rank of pages Web page ranking, key node mining pagerank double/string Louvain Louvain Community mining, hierarchical clustering louvain int/string KCore K core Community discovery, financial risk control kcore int/string LabelPropagation Label propagation Information spreading, advertising, and community discovery lpa int/string Hanp Label propagation advanced Community discovery, recommendation system hanp int/string ConnectedComponent Weakly connected component Community discovery, island discovery cc int/string StronglyConnectedComponent Strongly connected component Community discovery scc int/string ShortestPath The shortest path Path planning, network planning shortestpath string TriangleCount Triangle counting Network structure analysis trianglecount int/string GraphTriangleCount Graph triangle counting Network structure and tightness analysis count int BetweennessCentrality Intermediate centrality Key node mining, node influence computing betweenness double/string ClosenessCentrality Closeness centrality Key node mining, node influence computing closeness double/string DegreeStatic Degree of statistical Graph structure analysis degree,inDegree,outDegree int/string ClusteringCoefficient Aggregation coefficient Recommendation system, telecom fraud analysis clustercoefficient double/string Jaccard Jaccard similarity Similarity computing, recommendation system jaccard string BFS Breadth-First Search Sequence traversal, shortest path planning bfs string DFS Depth-First Search Sequence traversal, shortest path planning dfs string Node2Vec - Graph classification node2vec stringNote
When writing the algorithm results into the NebulaGraph, make sure that the tag in the corresponding graph space has properties names and data types corresponding to the table above.
"},{"location":"graph-computing/nebula-algorithm/#implementation_methods","title":"Implementation methods","text":"NebulaGraph Algorithm implements the graph calculating as follows:
Read the graph data of DataFrame from the NebulaGraph database using the NebulaGraph Spark Connector.
Transform the graph data of DataFrame to the GraphX graph.
Use graph algorithms provided by GraphX (such as PageRank) or self-implemented algorithms (such as Louvain).
For detailed implementation methods, see Scala file.
"},{"location":"graph-computing/nebula-algorithm/#get_nebulagraph_algorithm","title":"Get NebulaGraph Algorithm","text":""},{"location":"graph-computing/nebula-algorithm/#compile_and_package","title":"Compile and package","text":"Clone the repository nebula-algorithm
.
$ git clone -b v3.0.0 https://github.com/vesoft-inc/nebula-algorithm.git\n
Enter the directory nebula-algorithm
.
$ cd nebula-algorithm\n
Compile and package.
$ mvn clean package -Dgpg.skip -Dmaven.javadoc.skip=true -Dmaven.test.skip=true\n
After the compilation, a similar file nebula-algorithm-3.x.x.jar
is generated in the directory nebula-algorithm/target
.
Download address
"},{"location":"graph-computing/nebula-algorithm/#how_to_use","title":"How to use","text":"Note
If the value of the properties contains Chinese characters, the encoding error may appear. Please add the following options when submitting the Spark task:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
"},{"location":"graph-computing/nebula-algorithm/#use_algorithm_interface_recommended","title":"Use algorithm interface (recommended)","text":"The lib
repository provides 10 common graph algorithms.
Add dependencies to the file pom.xml
.
<dependency>\n <groupId>com.vesoft</groupId>\n <artifactId>nebula-algorithm</artifactId>\n <version>3.0.0</version>\n</dependency>\n
Use the algorithm (take PageRank as an example) by filling in parameters. For more examples, see example.
Note
By default, the DataFrame that executes the algorithm sets the first column as the starting vertex, the second column as the destination vertex, and the third column as the edge weights (not the rank in the NebulaGraph).
val prConfig = new PRConfig(5, 1.0)\nval prResult = PageRankAlgo.apply(spark, data, prConfig, false)\n
If your vertex IDs are Strings, see Pagerank Example for how to encoding and decoding them.
Set the Configuration file.
{\n # Configurations related to Spark\n spark: {\n app: {\n name: LPA\n # The number of partitions of Spark\n partitionNum:100\n }\n master:local\n }\n\n data: {\n # Data source. Optional values are nebula, csv, and json.\n source: csv\n # Data sink. The algorithm result will be written into this sink. Optional values are nebula, csv, and text.\n sink: nebula\n # Whether the algorithm has a weight.\n hasWeight: false\n }\n\n # Configurations related to NebulaGraph\n nebula: {\n # Data source. When NebulaGraph is the data source of the graph computing, the configuration of `nebula.read` is valid.\n read: {\n # The IP addresses and ports of all Meta services. Multiple addresses are separated by commas (,). Example: \"ip1:port1,ip2:port2\".\n # To deploy NebulaGraph by using Docker Compose, fill in the port with which Docker Compose maps to the outside.\n # Check the status with `docker-compose ps`.\n metaAddress: \"192.168.*.10:9559\"\n # The name of the graph space in NebulaGraph.\n space: basketballplayer\n # Edge types in NebulaGraph. When there are multiple labels, the data of multiple edges will be merged.\n labels: [\"serve\"]\n # The property name of each edge type in NebulaGraph. This property will be used as the weight column of the algorithm. Make sure that it corresponds to the edge type.\n weightCols: [\"start_year\"]\n }\n\n # Data sink. When the graph computing result sinks into NebulaGraph, the configuration of `nebula.write` is valid.\n write:{\n # The IP addresses and ports of all Graph services. Multiple addresses are separated by commas (,). Example: \"ip1:port1,ip2:port2\".\n # To deploy by using Docker Compose, fill in the port with which Docker Compose maps to the outside.\n # Check the status with `docker-compose ps`.\n graphAddress: \"192.168.*.11:9669\"\n # The IP addresses and ports of all Meta services. Multiple addresses are separated by commas (,). Example: \"ip1:port1,ip2:port2\".\n # To deploy NebulaGraph by using Docker Compose, fill in the port with which Docker Compose maps to the outside.\n # Check the staus with `docker-compose ps`.\n metaAddress: \"192.168.*.12:9559\"\n user:root\n pswd:nebula\n # Before submitting the graph computing task, create the graph space and tag.\n # The name of the graph space in NebulaGraph.\n space:nb\n # The name of the tag in NebulaGraph. The graph computing result will be written into this tag. The property name of this tag is as follows.\n # PageRank: pagerank\n # Louvain: louvain\n # ConnectedComponent: cc\n # StronglyConnectedComponent: scc\n # LabelPropagation: lpa\n # ShortestPath: shortestpath\n # DegreeStatic: degree,inDegree,outDegree\n # KCore: kcore\n # TriangleCount: tranglecpunt\n # BetweennessCentrality: betweennedss\n tag:pagerank\n }\n } \n\n local: {\n # Data source. When the data source is csv or json, the configuration of `local.read` is valid.\n read:{\n filePath: \"hdfs://127.0.0.1:9000/edge/work_for.csv\"\n # If the CSV file has a header or it is a json file, use the header. If not, use [_c0, _c1, _c2, ..., _cn] instead.\n # The header of the source VID column.\n srcId:\"_c0\"\n # The header of the destination VID column.\n dstId:\"_c1\"\n # The header of the weight column.\n weight: \"_c2\"\n # Whether the csv file has a header.\n header: false\n # The delimiter in the csv file.\n delimiter:\",\"\n }\n\n # Data sink. 
When the graph computing result sinks to the csv or text file, the configuration of `local.write` is valid.\n write:{\n resultPath:/tmp/\n }\n }\n\n\n algorithm: {\n # The algorithm to execute. Optional values are as follow: \n # pagerank, louvain, connectedcomponent, labelpropagation, shortestpaths, \n # degreestatic, kcore, stronglyconnectedcomponent, trianglecount ,\n # betweenness, graphtriangleCount.\n executeAlgo: pagerank\n\n # PageRank\n pagerank: {\n maxIter: 10\n resetProb: 0.15 \n encodeId:false # Configure true if the VID is of string type.\n }\n\n # Louvain\n louvain: {\n maxIter: 20\n internalIter: 10\n tol: 0.5\n encodeId:false # Configure true if the VID is of string type.\n }\n\n # ...\n\n}\n}\n
Note
When sink: nebula
is configured, it means that the algorithm results will be written back to the NebulaGraph cluster. The property names of the tag have implicit conventions. For details, see Supported algorithms section of this topic.
Submit the graph computing task.
${SPARK_HOME}/bin/spark-submit --master <mode> --class com.vesoft.nebula.algorithm.Main <nebula-algorithm-3.0.0.jar_path> -p <application.conf_path>\n
Example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.algorithm.Main /root/nebula-algorithm/target/nebula-algorithm-3.0-SNAPSHOT.jar -p /root/nebula-algorithm/src/main/resources/application.conf\n
NebulaGraph Importer (Importer) is a standalone tool for importing data from CSV files into NebulaGraph. Importer can read and batch import CSV file data from multiple data sources, and also supports batch update and delete operations.
"},{"location":"import-export/use-importer/#features","title":"Features","text":"The version correspondence between NebulaGraph and NebulaGraph Importer is as follows.
NebulaGraph version NebulaGraph Importer version 3.x.x 3.x.x, 4.x.x 2.x.x 2.x.x, 3.x.xNote
Importer 4.0.0 was rewritten for improved performance, but its configuration file is not compatible with earlier versions. It is recommended to use the new version of Importer.
"},{"location":"import-export/use-importer/#release_note","title":"Release note","text":"Release
"},{"location":"import-export/use-importer/#prerequisites","title":"Prerequisites","text":"Before using NebulaGraph Importer, make sure:
NebulaGraph service has been deployed.
The schema required for importing data has been created in NebulaGraph, including the graph space, tags, and edge types; alternatively, it can be created by the parameter manager.hooks.before.statements.
Prepare the CSV file to be imported and configure the YAML file so that the tool can batch write data into NebulaGraph.
Note
For details about the YAML configuration file, see Configuration File Description at the end of topic.
"},{"location":"import-export/use-importer/#download_binary_package_and_run","title":"Download binary package and run","text":"Download the executable binary package.
Note
The file installation path based on the RPM/DEB package is /usr/bin/nebula-importer
.
Under the directory where the binary file is located, run the following command to start importing data.
./<binary_file_name> --config <yaml_config_file_path>\n
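For example, a hypothetical invocation, assuming the binary is named nebula-importer and the configuration file is saved at /home/user/config.yaml (both names are illustrative):

./nebula-importer --config /home/user/config.yaml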
Compiling the source code requires deploying a Golang environment. For details, see Build Go environment.
Clone repository.
git clone -b release-4.1 https://github.com/vesoft-inc/nebula-importer.git\n
Note
Use the correct branch. Different branches have different RPC protocols.
Access the directory nebula-importer
.
cd nebula-importer\n
Compile the source code.
make build\n
Start the service.
./bin/nebula-importer --config <yaml_config_file_path>\n
Instead of setting up a Go environment locally, you can use Docker to pull the NebulaGraph Importer image and mount the local configuration file and CSV data file into the container. The command is as follows:
docker pull vesoft/nebula-importer:<version>\ndocker run --rm -ti \\\n --network=host \\\n -v <config_file>:<config_file> \\\n -v <data_dir>:<data_dir> \\\n vesoft/nebula-importer:<version> \\\n --config <config_file>\n
<config_file>
: The absolute path to the YAML configuration file.<data_dir>
: The absolute path to the CSV data file. If the file is not local, ignore this parameter.<version>
: The image version. For NebulaGraph 3.x, fill in v3.
Note
A relative path is recommended. If you use a local absolute path, check that the path maps to the corresponding path inside the Docker container.
Example:
docker pull vesoft/nebula-importer:v4\ndocker run --rm -ti \\\n --network=host \\\n -v /home/user/config.yaml:/home/user/config.yaml \\\n -v /home/user/data:/home/user/data \\\n vesoft/nebula-importer:v4 \\\n --config /home/user/config.yaml\n
"},{"location":"import-export/use-importer/#configuration_file_description","title":"Configuration File Description","text":"Various example configuration files are available within the Github of the NebulaGraph Importer. The configuration files are used to describe information about the files to be imported, NebulaGraph server information, etc. The following section describes the fields within the configuration file in categories.
Note
If users download a binary package, create the configuration file manually.
"},{"location":"import-export/use-importer/#client_configuration","title":"Client configuration","text":"Client configuration stores the configuration associated with the client's connection to the NebulaGraph.
The example configuration is as follows:
client:\n version: v3\n address: \"192.168.1.100:9669,192.168.1.101:9669\"\n user: root\n password: nebula\n ssl:\n enable: true\n certPath: \"/home/xxx/cert/importer.crt\"\n keyPath: \"/home/xxx/cert/importer.key\"\n caPath: \"/home/xxx/cert/root.crt\"\n insecureSkipVerify: false\n concurrencyPerAddress: 10\n reconnectInitialInterval: 1s\n retry: 3\n retryInitialInterval: 1s\n
Parameter Default value Required Description client.version
v3
Yes Specifies the major version of the NebulaGraph. Currently only v3
is supported. client.address
\"127.0.0.1:9669\"
Yes Specifies the address of the NebulaGraph. Multiple addresses are separated by commas. client.user
root
No NebulaGraph user name. client.password
nebula
No The password for the NebulaGraph user name. client.ssl.enable
false
No Specifies whether to enable SSL authentication. client.ssl.certPath
- No Specifies the storage path for the SSL public key certificate. This parameter is required when SSL authentication is enabled. client.ssl.keyPath
- No Specifies the storage path for the SSL key. This parameter is required when SSL authentication is enabled. client.ssl.caPath
- No Specifies the storage path for the CA root certificate. This parameter is required when SSL authentication is enabled. client.ssl.insecureSkipVerify
false
No Specifies whether the client skips verifying the server's certificate chain and hostname. If set to true
, any certificate chain and hostname provided by the server is accepted. client.concurrencyPerAddress
10
No The number of concurrent client connections for a single graph service. client.reconnectInitialInterval
1s
No Reconnect interval time. client.retry
3
No The number of retries for failed execution of the nGQL statement. client.retryInitialInterval
1s
No Retry interval time."},{"location":"import-export/use-importer/#manager_configuration","title":"Manager configuration","text":"Manager configuration controls the import behavior after connecting to the database.
The example configuration is as follows:
manager:\n spaceName: basic_string_examples\n batch: 128\n readerConcurrency: 50\n importerConcurrency: 512\n statsInterval: 10s\n hooks:\n before:\n - statements:\n - UPDATE CONFIGS storage:wal_ttl=3600;\n - UPDATE CONFIGS storage:rocksdb_column_family_options = { disable_auto_compactions = true };\n - statements:\n - |\n DROP SPACE IF EXISTS basic_string_examples;\n CREATE SPACE IF NOT EXISTS basic_string_examples(partition_num=5, replica_factor=1, vid_type=int);\n USE basic_string_examples;\n wait: 10s\n after:\n - statements:\n - |\n UPDATE CONFIGS storage:wal_ttl=86400;\n UPDATE CONFIGS storage:rocksdb_column_family_options = { disable_auto_compactions = false };\n
Parameter Default value Required Description manager.spaceName
- Yes Specifies the NebulaGraph space to import the data into. Importing multiple graph spaces at the same time is not supported. manager.batch
128
No The batch size for executing statements (global configuration). To set the batch size individually for a data source, use the parameter sources.batch
below. manager.readerConcurrency
50
No The number of concurrent reads of the data source by the reader. manager.importerConcurrency
512
No The number of concurrent nGQL statements generated to be executed, and then will call the client to execute these nGQL statements. manager.statsInterval
10s
No The time interval for printing statistical information. manager.hooks.before.[].statements
- No The command to execute in the graph space before importing. manager.hooks.before.[].wait
- No The wait time after statements
are executed. manager.hooks.after.[].statements
- No The commands to execute in the graph space after importing. manager.hooks.after.[].wait
- No The wait time after statements
are executed."},{"location":"import-export/use-importer/#log_configuration","title":"Log configuration","text":"Log configuration is the logging-related configuration.
The example configuration is as follows:
log:\n level: INFO\n console: true\n files:\n - logs/nebula-importer.log\n
Parameter Default value Required Description log.level
INFO
No Specifies the log level. Optional values are DEBUG
, INFO
, WARN
, ERROR
, PANIC
, FATAL
. log.console
true
No Whether to print the logs to console synchronously when storing logs. log.files
- No The log file path. The log directory must exist."},{"location":"import-export/use-importer/#source_configuration","title":"Source configuration","text":"The Source configuration requires the configuration of data source information, data processing methods, and Schema mapping.
The example configuration is as follows:
 
sources:\n - path: ./person.csv # Required. Specifies the path where the data files are stored. If a relative path is used, the path and current configuration file directory are spliced. Wildcard filename is also supported, for example: ./follower-*.csv, please make sure that all matching files have the same schema.\n# - s3: # AWS S3\n# endpoint: endpoint # Optional. The endpoint of S3 service, can be omitted if using AWS S3.\n# region: us-east-1 # Required. The region of S3 service.\n# bucket: gdelt-open-data # Required. The bucket of file in S3 service.\n# key: events/20190918.export.csv # Required. The object key of file in S3 service.\n# accessKeyID: "" # Optional. The access key of S3 service. If it is public data, no need to configure.\n# accessKeySecret: "" # Optional. The secret key of S3 service. If it is public data, no need to configure.\n# - oss:\n# endpoint: https://oss-cn-hangzhou.aliyuncs.com # Required. The endpoint of OSS service.\n# bucket: bucketName # Required. The bucket of file in OSS service.\n# key: objectKey # Required. The object key of file in OSS service.\n# accessKeyID: accessKey # Required. The access key of OSS service.\n# accessKeySecret: secretKey # Required. The secret key of OSS service.\n# - ftp:\n# host: 192.168.0.10 # Required. The host of FTP service.\n# port: 21 # Required. The port of FTP service.\n# user: user # Required. The user of FTP service.\n# password: password # Required. The password of FTP service.\n# path: "/events/20190918.export.csv" # Required. The path of file in the FTP service.\n# - sftp:\n# host: 192.168.0.10 # Required. The host of SFTP service.\n# port: 22 # Required. The port of SFTP service.\n# user: user # Required. The user of SFTP service.\n# password: password # Optional. The password of SFTP service.\n# keyFile: keyFile # Optional. The ssh key file path of SFTP service.\n# keyData: keyData # Optional. The ssh key file content of SFTP service.\n# passphrase: passphrase # Optional. The ssh key passphrase of SFTP service.\n# path: "/events/20190918.export.csv" # Required. The path of file in the SFTP service.\n# - hdfs:\n# address: "127.0.0.1:8020" # Required. The address of HDFS service.\n# user: "hdfs" # Optional. The user of HDFS service.\n# servicePrincipalName: <Kerberos Service Principal Name> # Optional. The name of the Kerberos service instance for the HDFS service when Kerberos authentication is enabled.\n# krb5ConfigFile: <Kerberos config file> # Optional. The path to the Kerberos configuration file for the HDFS service when Kerberos authentication is enabled. Defaults to `/etc/krb5.conf`.\n# ccacheFile: <Kerberos ccache file> # Optional. The path to the Kerberos ccache file for the HDFS service when Kerberos authentication is enabled.\n# keyTabFile: <Kerberos keytab file> # Optional. The path to the Kerberos keytab file for the HDFS service when Kerberos authentication is enabled.\n# password: <Kerberos password> # Optional. The Kerberos password for the HDFS service when Kerberos authentication is enabled.\n# dataTransferProtection: <Kerberos Data Transfer Protection> # Optional. The type of transport encryption when Kerberos authentication is enabled. Optional values are `authentication`, `integrity`, `privacy`.\n# disablePAFXFAST: false # Optional. Whether to disable the use of PA_FX_FAST for clients.\n# path: "/events/20190918.export.csv" # Required. The path to the file in the HDFS service. Wildcard filenames are also supported, e.g.
`/events/*.export.csv`, make sure all matching files have the same schema.\n# - gcs: # Google Cloud Storage\n# bucket: chicago-crime-sample # Required. The name of the bucket in the GCS service.\n# key: stats/000000000000.csv # Required. The path to the file in the GCS service.\n# withoutAuthentication: false # Optional. Whether to anonymize access. Defaults to false, which means access with credentials.\n# # When using credentials access, one of the credentialsFile and credentialsJSON parameters is sufficient.\n# credentialsFile: \"/path/to/your/credentials/file\" # Optional. The path to the credentials file for the GCS service.\n# credentialsJSON: '{ # Optional. The JSON content of the credentials for the GCS service.\n# \"type\": \"service_account\",\n# \"project_id\": \"your-project-id\",\n# \"private_key_id\": \"key-id\",\n# \"private_key\": \"-----BEGIN PRIVATE KEY-----\\nxxxxx\\n-----END PRIVATE KEY-----\\n\",\n# \"client_email\": \"your-client@your-project-id.iam.gserviceaccount.com\",\n# \"client_id\": \"client-id\",\n# \"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\",\n# \"token_uri\": \"https://oauth2.googleapis.com/token\",\n# \"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\",\n# \"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/your-client%40your-project-id.iam.gserviceaccount.com\",\n# \"universe_domain\": \"googleapis.com\"\n# }'\n batch: 256\n csv:\n delimiter: \"|\"\n withHeader: false\n lazyQuotes: false\n tags:\n - name: Person\n# mode: INSERT\n# filter: \n# expr: Record[1] == \"XXX\" \n id:\n type: \"STRING\"\n function: \"hash\"\n# index: 0 \n concatItems:\n - person_\n - 0\n - _id\n props:\n - name: \"firstName\"\n type: \"STRING\"\n index: 1\n - name: \"lastName\"\n type: \"STRING\"\n index: 2\n - name: \"gender\"\n type: \"STRING\"\n index: 3\n nullable: true\n defaultValue: female\n - name: \"birthday\"\n type: \"DATE\"\n index: 4\n nullable: true\n nullValue: _NULL_\n - name: \"creationDate\"\n type: \"DATETIME\"\n index: 5\n - name: \"locationIP\"\n type: \"STRING\"\n index: 6\n - name: \"browserUsed\"\n type: \"STRING\"\n index: 7\n - path: ./knows.csv\n batch: 256\n edges:\n - name: KNOWS # person_knows_person\n# mode: INSERT\n# filter: \n# expr: Record[1] == \"XXX\"\n src:\n id:\n type: \"STRING\"\n concatItems:\n - person_\n - 0\n - _id\n dst:\n id:\n type: \"STRING\"\n concatItems:\n - person_\n - 1\n - _id\n props:\n - name: \"creationDate\"\n type: \"DATETIME\"\n index: 2\n nullable: true\n nullValue: _NULL_\n defaultValue: 0000-00-00T00:00:00\n
The configuration mainly includes the following parts:
sources.path
sources.s3
sources.oss
sources.ftp
sources.sftp
sources.hdfs
- No Specify data source information, such as a local file, HDFS, or S3. Only one data source can be configured per source
 entry. To configure multiple data sources, use multiple source
 entries. See the comments in the example for the configuration items of different data sources. sources.batch
256
No The batch size for executing statements when importing this data source. The priority is higher than manager.batch
. sources.csv.delimiter
,
No Specifies the delimiter for the CSV file. Only 1-character string separators are supported. Special characters like tabs (\\t
) and hexadecimal values (e.g., 0x03
or Ctrl+C
) must be properly escaped and enclosed in double quotes, such as \"\\t\"
for tabs and \"\\x03\"
or \"\\u0003\"
for hexadecimal values, instead of using single quotes. For details on escaping special characters in yaml format, see Escaped Characters. sources.csv.withHeader
false
No Whether to ignore the first record in the CSV file. sources.csv.lazyQuotes
false
No Whether to allow lazy quotes. If lazyQuotes
is true, a quote may appear in an unquoted field and a non-doubled quote may appear in a quoted field. sources.tags.name
- Yes The tag name. sources.tags.mode
INSERT
No Batch operation types, including insert, update and delete. Optional values are INSERT
, UPDATE
and DELETE
. sources.tags.filter.expr
- No Filter the data and only import if the filter conditions are met. Supported comparison characters are ==
, !=
, <
, >
, <=
and >=
. Logical operators supported are not
(!), and
(&&) and or
(||). For example (Record[0] == \"Mahinda\" or Record[0] == \"Michael\") and Record[3] == \"male\"
. sources.tags.id.type
STRING
No The type of the VID. sources.tags.id.function
- No Functions to generate the VID. Currently, only function hash
is supported. sources.tags.id.index
- No The column number corresponding to the VID in the data file. If sources.tags.id.concatItems
is not configured, this parameter must be configured. sources.tags.id.concatItems
- No Used to concatenate two or more arrays, the concatenated items can be string
, int
or mixed. string
stands for a constant, int
for an index column. If this parameter is set, the sources.tags.id.index
parameter will not take effect. sources.tags.ignoreExistedIndex
true
No Whether to enable IGNORE_EXISTED_INDEX
, that is, do not update the index after inserting a vertex. sources.tags.props.name
- Yes The tag property name, which must match the Tag property in the database. sources.tags.props.type
STRING
No Property data type, supporting BOOL
, INT
, FLOAT
, DOUBLE
, STRING
, TIME
, TIMESTAMP
, DATE
, DATETIME
, GEOGRAPHY
, GEOGRAPHY(POINT)
, GEOGRAPHY(LINESTRING)
and GEOGRAPHY(POLYGON)
. sources.tags.props.index
- Yes The property corresponds to the column number in the data file. sources.tags.props.nullable
false
No Whether this property can be NULL
, optional values are true
or false
. sources.tags.props.nullValue
- No Ignored when nullable
is false
. The value used to determine whether it is a NULL
. The property is set to NULL
when the value is equal to nullValue
. sources.tags.props.alternativeIndices
- No Ignored when nullable
is false
. The property is fetched from records according to the indices in order until not equal to nullValue
. sources.tags.props.defaultValue
- No Ignored when nullable
is false
. The property default value, when all the values obtained by index
and alternativeIndices
are nullValue
. sources.edges.name
- Yes The edge type name. sources.edges.mode
INSERT
No Batch operation types, including insert, update and delete. Optional values are INSERT
, UPDATE
and DELETE
. sources.edges.filter.expr
- No Filter the data and only import if the filter conditions are met. Supported comparison characters are ==
, !=
, <
, >
, <=
and >=
. Logical operators supported are not
(!), and
(&&) and or
(||). For example (Record[0] == \"Mahinda\" or Record[0] == \"Michael\") and Record[3] == \"male\"
. sources.edges.src.id.type
STRING
No The data type of the VID at the starting vertex on the edge. sources.edges.src.id.index
- Yes The column number in the data file corresponding to the VID at the starting vertex on the edge. sources.edges.dst.id.type
STRING
No The data type of the VID at the destination vertex on the edge. sources.edges.dst.id.index
- Yes The column number in the data file corresponding to the VID at the destination vertex on the edge. sources.edges.rank.index
- No The column number in the data file corresponding to the rank on the edge. sources.edges.ignoreExistedIndex
true
No Whether to enable IGNORE_EXISTED_INDEX
, that is, do not update the index after inserting an edge. sources.edges.props.name
- No The edge type property name, which must match the edge type property in the database. sources.edges.props.type
STRING
No Property data type, supporting BOOL
, INT
, FLOAT
, DOUBLE
, STRING
, TIME
, TIMESTAMP
, DATE
, DATETIME
, GEOGRAPHY
, GEOGRAPHY(POINT)
, GEOGRAPHY(LINESTRING)
and GEOGRAPHY(POLYGON)
. sources.edges.props.index
- No The property corresponds to the column number in the data file. sources.edges.props.nullable
- No Whether this property can be NULL
, optional values are true
or false
. sources.edges.props.nullValue
- No Ignored when nullable
is false
. The value used to determine whether it is a NULL
. The property is set to NULL
when the value is equal to nullValue
. sources.edges.props.defaultValue
- No Ignored when nullable
is false
. The property default value, when all the values obtained by index
and alternativeIndices
are nullValue
. Note
Column numbers in the CSV file start from 0; that is, the first column is numbered 0 and the second column is numbered 1.
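Putting the column numbering together with the parameters above, below is a minimal sketch of a sources entry for a headerless vertex CSV. The file name player.csv, the tag player, and its properties are illustrative assumptions, not part of the reference:

sources:
  - path: ./player.csv        # illustrative file: headerless CSV with columns VID, name, age
    csv:
      delimiter: ","
      withHeader: false
    tags:
      - name: player          # assumed tag, created in NebulaGraph beforehand
        id:
          type: "STRING"
          index: 0            # column 0 holds the VID
        props:
          - name: "name"
            type: "STRING"
            index: 1          # column 1 maps to the property name
          - name: "age"
            type: "INT"
            index: 2          # column 2 maps to the property age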
"},{"location":"import-export/use-importer/#faq","title":"FAQ","text":""},{"location":"import-export/use-importer/#what_are_the_descriptions_of_the_fields_in_the_log_output","title":"What are the descriptions of the fields in the log output?","text":"The following is an example of a log entry:
"msg": "44m20s 2h7m10s 25.85%(129 GiB/498 GiB) Records{Finished: 302016726, Failed: 0, Rate: 113538.13/s}, Requests{Finished: 181786, Failed: 0, Latency: 4.046496736s/4.06694393s, Rate: 68.34/s}, Processed{Finished: 908575178, Failed: 0, Rate: 341563.62/s}"\n
The fields are described below:
44m20s 2h7m10s 25.85%(129 GiB/498 GiB)
corresponds to basic information about the importing process.Records
corresponds to the records of the CSV files.Finished
: The number of the completed records.Failed
: The number of the failed records.Rate
: The number of records imported per second.Requests
corresponds to the requests.Finished
: The number of the completed requests.Failed
: The number of the failed requests.Latency
: The time consumed by server-side requests / The time consumed by client-side requests.Rate
: The number of requests processed per second.Processed
corresponds to nodes and edges.Finished
: The number of the completed nodes and edges.Failed
: The number of the failed nodes and edges.Rate
: The number of nodes and edges processed per second.
There are many ways to write data into NebulaGraph master:
The following figure shows the positions of these ways:
"},{"location":"import-export/write-tools/#export_tools","title":"Export tools","text":"Export the data in database to a CSV file or another graph space (different NebulaGraph database clusters are supported) using the export function of the Exchange.
Enterpriseonly
The export function is exclusively available in the Enterprise Edition. If you require access to this version, please contact us.
Could not resolve dependencies for project xxx
","text":"Please check the mirror
part of Maven installation directory libexec/conf/settings.xml
:
<mirror>\n <id>alimaven</id>\n <mirrorOf>central</mirrorOf>\n <name>aliyun maven</name>\n <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>\n</mirror>\n
Check whether the value of mirrorOf
is configured to *
. If it is, change it to central
or *,!SparkPackagesRepo,!bintray-streamnative-maven
.
Reason: There are two dependency packages in Exchange's pom.xml
that are not in Maven's central repository. pom.xml
configures the repository address for these two dependencies. If the mirrorOf
value for the mirror address configured in Maven is *
, all dependencies will be downloaded from the Central repository, causing the download to fail.
Problem description: The system reports Could not find artifact com.vesoft:client:jar:xxx-SNAPSHOT
when compiling.
Cause: There is no local Maven repository for storing or downloading SNAPSHOT packages. The default central repository in Maven only stores official releases, not development versions (SNAPSHOT).
Solution: Add the following configuration in the profiles
scope of Maven's setting.xml
file:
<profile>\n <activation>\n <activeByDefault>true</activeByDefault>\n </activation>\n <repositories>\n <repository>\n <id>snapshots</id>\n <url>https://oss.sonatype.org/content/repositories/snapshots/</url>\n <snapshots>\n <enabled>true</enabled>\n </snapshots>\n </repository>\n </repositories>\n </profile>\n
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#execution","title":"Execution","text":""},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_error_javalangclassnotfoundexception_comvesoftnebulaexchangeexchange","title":"Q: Error: java.lang.ClassNotFoundException: com.vesoft.nebula.exchange.Exchange
","text":"To submit a task in Yarn-Cluster mode, run the following command, especially the two '--conf' commands in the example.
$SPARK_HOME/bin/spark-submit --class com.vesoft.nebula.exchange.Exchange \\\n--master yarn-cluster \\\n--files application.conf \\\n--conf spark.driver.extraClassPath=./ \\\n--conf spark.executor.extraClassPath=./ \\\nnebula-exchange-3.0.0.jar \\\n-c application.conf\n
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_error_method_name_xxx_not_found","title":"Q: Error: method name xxx not found
","text":"Generally, the port configuration is incorrect. Check the port configuration of the Meta service, Graph service, and Storage service.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_error_nosuchmethod_methodnotfound_exception_in_thread_main_javalangnosuchmethoderror_etc","title":"Q: Error: NoSuchMethod, MethodNotFound (Exception in thread \"main\" java.lang.NoSuchMethodError
, etc)","text":"Most errors are caused by JAR package conflicts or version conflicts. Check whether the version of the error reporting service is the same as that used in Exchange, especially Spark, Scala, and Hive.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_when_exchange_imports_hive_data_error_exception_in_thread_main_orgapachesparksqlanalysisexception_table_or_view_not_found","title":"Q: When Exchange imports Hive data, error:Exception in thread \"main\" org.apache.spark.sql.AnalysisException: Table or view not found
","text":"Check whether the -h
parameter is omitted in the command for submitting the Exchange task and whether the table and database are correct, and run the user-configured exec statement in spark-SQL to verify the correctness of the exec statement.
com.facebook.thrift.protocol.TProtocolException: Expected protocol id xxx
","text":"Check that the NebulaGraph service port is configured correctly.
--port
in the configuration file for each service. Execute docker-compose ps
in the nebula-docker-compose
directory, for example:
$ docker-compose ps\n Name Command State Ports\n---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\nnebula-docker-compose_graphd_1 /usr/local/nebula/bin/nebu ... Up (healthy) 0.0.0.0:33205->19669/tcp, 0.0.0.0:33204->19670/tcp, 0.0.0.0:9669->9669/tcp\nnebula-docker-compose_metad0_1 ./bin/nebula-metad --flagf ... Up (healthy) 0.0.0.0:33165->19559/tcp, 0.0.0.0:33162->19560/tcp, 0.0.0.0:33167->9559/tcp, 9560/tcp\nnebula-docker-compose_metad1_1 ./bin/nebula-metad --flagf ... Up (healthy) 0.0.0.0:33166->19559/tcp, 0.0.0.0:33163->19560/tcp, 0.0.0.0:33168->9559/tcp, 9560/tcp\nnebula-docker-compose_metad2_1 ./bin/nebula-metad --flagf ... Up (healthy) 0.0.0.0:33161->19559/tcp, 0.0.0.0:33160->19560/tcp, 0.0.0.0:33164->9559/tcp, 9560/tcp\nnebula-docker-compose_storaged0_1 ./bin/nebula-storaged --fl ... Up (healthy) 0.0.0.0:33180->19779/tcp, 0.0.0.0:33178->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:33183->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged1_1 ./bin/nebula-storaged --fl ... Up (healthy) 0.0.0.0:33175->19779/tcp, 0.0.0.0:33172->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:33177->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged2_1 ./bin/nebula-storaged --fl ... Up (healthy) 0.0.0.0:33184->19779/tcp, 0.0.0.0:33181->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:33185->9779/tcp, 9780/tcp\n
Check the Ports
column to find the docker mapped port number, for example:
- The port number available for Graph service is 9669.
- The port number for Meta service are 33167, 33168, 33164.
- The port number for Storage service are 33183, 33177, 33185.
Exception in thread \"main\" com.facebook.thrift.protocol.TProtocolException: The field 'code' has been assigned the invalid value -4
","text":"Check whether the version of Exchange is the same as that of NebulaGraph. For more information, see Limitations.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_how_to_correct_the_encoding_error_when_importing_data_in_a_spark_environment","title":"Q: How to correct the encoding error when importing data in a Spark environment?","text":"It may happen if the property value of the data contains Chinese characters. The solution is to add the following options before the JAR package path in the import command:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
Namely:
<spark_install_path>/bin/spark-submit --master \"local\" \\\n--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \\\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8 \\\n--class com.vesoft.nebula.exchange.Exchange \\\n<nebula-exchange-3.x.y.jar_path> -c <application.conf_path>\n
In YARN, use the following command:
<spark_install_path>/bin/spark-submit \\\n--class com.vesoft.nebula.exchange.Exchange \\\n--master yarn-cluster \\\n--files <application.conf_path> \\\n--conf spark.driver.extraClassPath=./ \\\n--conf spark.executor.extraClassPath=./ \\\n--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \\\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8 \\\n<nebula-exchange-3.x.y.jar_path> \\\n-c application.conf\n
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_orgrocksdbrocksdbexception_while_open_a_file_for_appending_pathsst1-xxxsst_no_such_file_or_directory","title":"Q: org.rocksdb.RocksDBException: While open a file for appending: /path/sst/1-xxx.sst: No such file or directory","text":"Solution:
/path
exists. If not, or if the path is set incorrectly, create or correct it./path
. If not, grant the permission.- limit: Represents the size of the token bucket.
- timeout: Represents the timeout period for obtaining the token.
The values of these four parameters can be adjusted appropriately according to the machine performance. If the leader of the Storage service changes during the import process, you can adjust the values of these four parameters to reduce the import speed.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#others","title":"Others","text":""},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_which_versions_of_nebulagraph_are_supported_by_exchange","title":"Q: Which versions of NebulaGraph are supported by Exchange?","text":"See Limitations.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_what_is_the_relationship_between_exchange_and_spark_writer","title":"Q: What is the relationship between Exchange and Spark Writer?","text":"Exchange is the Spark application developed based on Spark Writer. Both are suitable for bulk migration of cluster data to NebulaGraph in a distributed environment, but later maintenance work will be focused on Exchange. Compared with Spark Writer, Exchange has the following improvements:
This topic introduces how to get the JAR file of NebulaGraph Exchange.
"},{"location":"import-export/nebula-exchange/ex-ug-compile/#download_the_jar_file_directly","title":"Download the JAR file directly","text":"The JAR file of Exchange Community Edition can be downloaded directly.
To download Exchange Enterprise Edition, contact us.
"},{"location":"import-export/nebula-exchange/ex-ug-compile/#get_the_jar_file_by_compiling_the_source_code","title":"Get the JAR file by compiling the source code","text":"You can get the JAR file of Exchange Community Edition by compiling the source code. The following introduces how to compile the source code of Exchange.
Enterpriseonly
You can get Exchange Enterprise Edition in NebulaGraph Enterprise Edition Package only.
"},{"location":"import-export/nebula-exchange/ex-ug-compile/#prerequisites","title":"Prerequisites","text":"Clone the repository nebula-exchange
in the /
directory.
git clone -b release-3.7 https://github.com/vesoft-inc/nebula-exchange.git\n
Switch to the directory nebula-exchange
.
cd nebula-exchange\n
Package NebulaGraph Exchange. Run the following command based on the Spark version:
For Spark 2.2\uff1a
mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true \\\n-pl nebula-exchange_spark_2.2 -am -Pscala-2.11 -Pspark-2.2\n
For Spark 2.4\uff1a
mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true \\\n-pl nebula-exchange_spark_2.4 -am -Pscala-2.11 -Pspark-2.4\n
For Spark 3.0\uff1a
mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true \\\n-pl nebula-exchange_spark_3.0 -am -Pscala-2.12 -Pspark-3.0\n
After the compilation is successful, you can find the nebula-exchange_spark_x.x-release-3.7.jar
file in the nebula-exchange_spark_x.x/target/
directory. x.x
indicates the Spark version, for example, 2.4
.
Note
The JAR file version changes with the release of the NebulaGraph Java Client. Users can view the latest version on the Releases page.
When migrating data, you can refer to configuration file target/classes/application.conf
.
If downloading dependencies fails when compiling:
Modify the mirror
part of Maven installation directory libexec/conf/settings.xml
:
<mirror>\n <id>alimaven</id>\n <mirrorOf>central</mirrorOf>\n <name>aliyun maven</name>\n <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>\n</mirror>\n
This topic describes some of the limitations of using Exchange 3.x.
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-limitations/#environment","title":"Environment","text":"Exchange 3.x supports the following operating systems:
To ensure the healthy operation of Exchange, ensure that the following software has been installed on the machine:
Apache Spark. The requirements for Spark versions when using Exchange to export data from data sources are as follows. In the following table, Y means that the corresponding Spark version is supported, and N means not supported.
Note
Use the correct Exchange JAR file based on the Spark version. For example, for Spark version 2.4, use nebula-exchange_spark_2.4-3.7.0.jar.
Data source Spark 2.2 Spark 2.4 Spark 3 CSV file Y N Y JSON file Y Y Y ORC file Y Y Y Parquet file Y Y Y HBase Y Y Y MySQL Y Y Y PostgreSQL Y Y Y Oracle Y Y Y ClickHouse Y Y Y Neo4j N Y N Hive Y Y Y MaxCompute N Y N Pulsar N Y Untested Kafka N Y Untested NebulaGraph N Y NHadoop Distributed File System (HDFS) needs to be deployed in the following scenarios:
NebulaGraph Exchange (Exchange) is an Apache Spark\u2122 application for bulk migration of cluster data to NebulaGraph in a distributed environment, supporting batch and streaming data migration in a variety of formats.
Exchange consists of Reader, Processor, and Writer. After Reader reads data from different sources and returns a DataFrame, the Processor iterates through each row of the DataFrame and obtains the corresponding value based on the mapping between fields
in the configuration file. After iterating through the number of rows in the specified batch, Writer writes the captured data to the NebulaGraph at once. The following figure illustrates the process by which Exchange completes the data conversion and migration.
Exchange has two editions, the Community Edition and the Enterprise Edition. The Community Edition is open source developed on GitHub. The Enterprise Edition supports not only the functions of the Community Edition but also adds additional features. For details, see Comparisons.
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-what-is-exchange/#scenarios","title":"Scenarios","text":"Exchange applies to the following scenarios:
The data saved in NebulaGraph needs to be exported.
Enterpriseonly
Exporting the data saved in NebulaGraph is supported by Exchange Enterprise Edition only.
Exchange has the following advantages:
Resumable data import: It supports resumable data import to save time and improve data import efficiency.
Note
Resumable data import is currently supported when migrating Neo4j data only.
Exchange supports Spark versions 2.2.x, 2.4.x, and 3.x.x, which are named nebula-exchange_spark_2.2
, nebula-exchange_spark_2.4
, and nebula-exchange_spark_3.0
for different Spark versions.
The correspondence between the NebulaGraph Exchange version (the JAR version), the NebulaGraph core version and the Spark version is as follows.
Exchange version | NebulaGraph version | Spark version
nebula-exchange_spark_3.0-3.0-SNAPSHOT.jar | nightly | 3.3.x, 3.2.x, 3.1.x, 3.0.x
nebula-exchange_spark_2.4-3.0-SNAPSHOT.jar | nightly | 2.4.x
nebula-exchange_spark_2.2-3.0-SNAPSHOT.jar | nightly | 2.2.x
nebula-exchange_spark_3.0-3.4.0.jar | 3.x.x | 3.3.x, 3.2.x, 3.1.x, 3.0.x
nebula-exchange_spark_2.4-3.4.0.jar | 3.x.x | 2.4.x
nebula-exchange_spark_2.2-3.4.0.jar | 3.x.x | 2.2.x
nebula-exchange_spark_3.0-3.3.0.jar | 3.x.x | 3.3.x, 3.2.x, 3.1.x, 3.0.x
nebula-exchange_spark_2.4-3.3.0.jar | 3.x.x | 2.4.x
nebula-exchange_spark_2.2-3.3.0.jar | 3.x.x | 2.2.x
nebula-exchange_spark_3.0-3.0.0.jar | 3.x.x | 3.3.x, 3.2.x, 3.1.x, 3.0.x
nebula-exchange_spark_2.4-3.0.0.jar | 3.x.x | 2.4.x
nebula-exchange_spark_2.2-3.0.0.jar | 3.x.x | 2.2.x
nebula-exchange-2.6.3.jar | 2.6.1, 2.6.0 | 2.4.x
nebula-exchange-2.6.2.jar | 2.6.1, 2.6.0 | 2.4.x
nebula-exchange-2.6.1.jar | 2.6.1, 2.6.0 | 2.4.x
nebula-exchange-2.6.0.jar | 2.6.1, 2.6.0 | 2.4.x
nebula-exchange-2.5.2.jar | 2.5.1, 2.5.0 | 2.4.x
nebula-exchange-2.5.1.jar | 2.5.1, 2.5.0 | 2.4.x
nebula-exchange-2.5.0.jar | 2.5.1, 2.5.0 | 2.4.x
nebula-exchange-2.1.0.jar | 2.0.1, 2.0.0 | 2.4.x
nebula-exchange-2.0.1.jar | 2.0.1, 2.0.0 | 2.4.x
nebula-exchange-2.0.0.jar | 2.0.1, 2.0.0 | 2.4.x
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-what-is-exchange/#data_source","title":"Data source","text":"Exchange 3.7.0 supports converting data from the following formats or sources into vertexes and edges that NebulaGraph can recognize, and then importing them into NebulaGraph in the form of nGQL statements:
Data repository:
In addition to importing data as nGQL statements, Exchange supports generating SST files for data sources and then importing SST files via Console.
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-what-is-exchange/#release_note","title":"Release note","text":"Release
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command/","title":"Options for import","text":"After editing the configuration file, run the following commands to import specified source data into the NebulaGraph database.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command/#import_data","title":"Import data","text":"<spark_install_path>/bin/spark-submit --master \"spark://HOST:PORT\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> \n
Note
If the value of the properties contains Chinese characters, the encoding error may appear. Please add the following options when submitting the Spark task:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
The following table lists command parameters.
Parameter Required Default value Description--class
Yes - Specify the main class of the driver. --master
Yes - Specify the URL of the master process in a Spark cluster. For more information, see master-urls. Optional values are:local
: Local Mode. Run Spark applications on a single thread. Suitable for importing small data sets in a test environment.yarn
: Run Spark applications on a YARN cluster. Suitable for importing large data sets in a production environment.spark://HOST:PORT
: Connect to the specified Spark standalone cluster.mesos://HOST:PORT
: Connect to the specified Mesos cluster.k8s://HOST:PORT
: Connect to the specified Kubernetes cluster. -c
/--config
Yes - Specify the path of the configuration file. -h
/--hive
No false
Specify whether importing Hive data is supported. -D
/--dry
No false
Specify whether to check the format of the configuration file. This parameter is used to check the format of the configuration file only, it does not check the validity of tags
and edges
configurations and does not import data. Don't add this parameter if you need to import data. -r
/--reload
No - Specify the path of the reload file that needs to be reloaded. For more Spark parameter configurations, see Spark Configuration.
Note
$SPARK_HOME/bin/spark-submit --master yarn \\\n--class com.vesoft.nebula.exchange.Exchange \\\n--files application.conf \\\n--conf spark.driver.extraClassPath=./ \\\n--conf spark.executor.extraClassPath=./ \\\nnebula-exchange-3.7.0.jar \\\n-c application.conf\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command/#import_the_reload_file","title":"Import the reload file","text":"If some data fails to be imported during the import, the failed data will be stored in the reload file. Use the parameter -r
to import the data in the reload file.
<spark_install_path>/bin/spark-submit --master \"spark://HOST:PORT\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> -r \"<reload_file_path>\" \n
If the import still fails, go to Official Forum for consultation.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/","title":"Parameters in the configuration file","text":"This topic describes how to automatically generate a template configuration file when users use NebulaGraph Exchange, and introduces the configuration file application.conf
.
Specify the data source to be imported with the following command to get the template configuration file corresponding to the data source.
java -cp <exchange_jar_package> com.vesoft.exchange.common.GenerateConfigTemplate -s <source_type> -p <config_file_save_path>\n
Example:
java -cp nebula-exchange_spark_2.4-3.0-SNAPSHOT.jar com.vesoft.exchange.common.GenerateConfigTemplate -s csv -p /home/nebula/csv_application.conf\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#using_an_encrypted_password","title":"Using an encrypted password","text":"You can use either a plaintext password or an RSA encrypted password when setting the password for connecting to NebulaGraph in the configuration file.
To use an RSA-encrypted password, you need to configure the following settings in the configuration file:
nebula.pswd
is configured as the RSA encrypted password.nebula.privateKey
is configured as the key for RSA encryption.nebula.enableRSA
is configured as true
.Users can use their own tools for encryption, or they can use the encryption tool provided in Exchange's JAR package, for example:
spark-submit --master local --class com.vesoft.exchange.common.PasswordEncryption nebula-exchange_spark_2.4-3.0-SNAPSHOT.jar -p nebula\n
The results returned are as follows:
=================== public key begin ===================\nMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCLl7LaNSEXlZo2hYiJqzxgyFBQdkxbQXYU/xQthsBJwjOPhkiY37nokzKnjNlp6mv5ZUomqxLsoNQHEJ6BZD4VPiaiElFAkTD+gyul1v8f3A446Fr2rnVLogWHnz8ECPt7X8jwmpiKOXkOPIhqU5E0Cua+Kk0nnVosbos/VShfiQIDAQAB\n=================== public key end ===================\n\n\n=================== private key begin ===================\nMIICeAIBADANBgkqhkiG9w0BAQEFAASCAmIwggJeAgEAAoGBAIuXsto1IReVmjaFiImrPGDIUFB2TFtBdhT/FC2GwEnCM4+GSJjfueiTMqeM2Wnqa/llSiarEuyg1AcQnoFkPhU+JqISUUCRMP6DK6XW/x/cDjjoWvaudUuiBYefPwQI+3tfyPCamIo5eQ48iGpTkTQK5r4qTSedWixuiz9VKF+JAgMBAAECgYADWbfEPwQ1UbTq3Bej3kVLuWMcG0rH4fFYnaq5UQOqgYvFRR7W9H+80lOj6+CIB0ViLgkylmaU4WNVbBOx3VsUFFWSqIIIviKubg8m8ey7KAd9X2wMEcUHi4JyS2+/WSacaXYS5LOmMevvuaOwLEV0QmyM+nNGRIjUdzCLR1935QJBAM+IF8YD5GnoAPPjGIDS1Ljhu/u/Gj6/YBCQKSHQ5+HxHEKjQ/YxQZ/otchmMZanYelf1y+byuJX3NZ04/KSGT8CQQCsMaoFO2rF5M84HpAXPi6yH2chbtz0VTKZworwUnpmMVbNUojf4VwzAyOhT1U5o0PpFbpi+NqQhC63VUN5k003AkEArI8vnVGNMlZbvG7e5/bmM9hWs2viSbxdB0inOtv2g1M1OV+B2gp405ru0/PNVcRV0HQFfCuhVfTSxmspQoAihwJBAJW6EZa/FZbB4JVxreUoAr6Lo8dkeOhT9M3SZbGWZivaFxot/Cp/8QXCYwbuzrJxjqlsZUeOD6694Uk08JkURn0CQQC8V6aRa8ylMhLJFkGkMDHLqHcQCmY53Kd73mUu4+mjMJLZh14zQD9ydFtc0lbLXTeBAMWV3uEdeLhRvdAo3OwV\n=================== private key end ===================\n\n\n=================== encrypted password begin ===================\nIo+3y3mLOMnZJJNUPHZ8pKb4VfTvg6wUh6jSu5xdmLAoX/59tK1HTwoN40aOOWJwa1a5io7S4JqcX/jEcAorw7pelITr+F4oB0AMCt71d+gJuu3/lw9bjUEl9tF4Raj82y2Dg39wYbagN84fZMgCD63TPiDIevSr6+MFKASpGrY=\n=================== encrypted password end ===================\ncheck: the real password decrypted by private key and encrypted password is: nebula\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#configuration_instructions","title":"Configuration instructions","text":"Before configuring the application.conf
file, it is recommended to copy the file name application.conf
and then edit the file name according to the file type of a data source. For example, change the file name to csv_application.conf
if the file type of the data source is CSV.
The application.conf
file contains the following content types:
This topic lists only some Spark parameters. For more information, see Spark Configuration.
Parameter Type Default value Required Descriptionspark.app.name
string - No The drive name in Spark. spark.driver.cores
int 1
No The number of CPU cores used by a driver, only applicable to a cluster mode. spark.driver.maxResultSize
string 1G
No The total size limit (in bytes) of the serialized results of all partitions in a single Spark operation (such as collect). The minimum value is 1M, and 0 means unlimited. spark.executor.memory
string 1G
No The amount of memory used by a Spark driver which can be specified in units, such as 512M or 1G. spark.cores.max
int 16
No The maximum number of CPU cores of applications requested across clusters (rather than from each node) when a driver runs in a coarse-grained sharing mode on a standalone cluster or a Mesos cluster. The default value is spark.deploy.defaultCores
on a Spark standalone cluster manager or the value of the infinite
parameter (all available cores) on Mesos."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#hive_configurations_optional","title":"Hive configurations (optional)","text":"Users only need to configure parameters for connecting to Hive if Spark and Hive are deployed in different clusters. Otherwise, please ignore the following configurations.
Parameter Type Default value Required Descriptionhive.warehouse
string - Yes The warehouse path in HDFS. Enclose the path in double quotes and start with hdfs://
. hive.connectionURL
string - Yes The URL of a JDBC connection. For example, \"jdbc:mysql://127.0.0.1:3306/hive_spark?characterEncoding=UTF-8\"
. hive.connectionDriverName
string \"com.mysql.jdbc.Driver\"
Yes The driver name. hive.connectionUserName
list[string] - Yes The username for connections. hive.connectionPassword
list[string] - Yes The account password."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#nebulagraph_configurations","title":"NebulaGraph configurations","text":"Parameter Type Default value Required Description nebula.address.graph
list[string] [\"127.0.0.1:9669\"]
Yes The addresses of all Graph services, including IPs and ports, separated by commas (,). Example: [\"ip1:port1\",\"ip2:port2\",\"ip3:port3\"]
. nebula.address.meta
list[string] [\"127.0.0.1:9559\"]
Yes The addresses of all Meta services, including IPs and ports, separated by commas (,). Example: [\"ip1:port1\",\"ip2:port2\",\"ip3:port3\"]
. nebula.user
string - Yes The username with write permissions for NebulaGraph. nebula.pswd
string - Yes The account password. The password can be plaintext or RSA encrypted. To use an RSA encrypted password, you need to set enableRSA
and privateKey
. For how to encrypt a password, see Using an encrypted password above. nebula.enableRSA
bool false
No Whether to use an RSA encrypted password. nebula.privateKey
string - No The key used to encrypt the password using RSA. nebula.space
string - Yes The name of the graph space where data needs to be imported. nebula.ssl.enable.graph
bool false
Yes Enables the SSL encryption between Exchange and Graph services. If the value is true
, the SSL encryption is enabled and the following SSL parameters take effect. If Exchange is run on a multi-machine cluster, you need to store the corresponding files in the same path on each machine when setting the following SSL-related paths. nebula.ssl.sign
string ca
Yes Specifies the SSL sign. Optional values are ca
and self
. nebula.ssl.ca.param.caCrtFilePath
string Specifies the storage path of the CA certificate. It takes effect when the value of nebula.ssl.sign
is ca
. nebula.ssl.ca.param.crtFilePath
string \"/path/crtFilePath\"
Yes Specifies the storage path of the CRT certificate. It takes effect when the value of nebula.ssl.sign
is ca
. nebula.ssl.ca.param.keyFilePath
string \"/path/keyFilePath\"
Yes Specifies the storage path of the key file. It takes effect when the value of nebula.ssl.sign
is ca
. nebula.ssl.self.param.crtFilePath
string \"/path/crtFilePath\"
Yes Specifies the storage path of the CRT certificate. It takes effect when the value of nebula.ssl.sign
is self
. nebula.ssl.self.param.keyFilePath
string \"/path/keyFilePath\"
Yes Specifies the storage path of the key file. It takes effect when the value of nebula.ssl.sign
is self
. nebula.ssl.self.param.password
string \"nebula\"
Yes Specifies the storage path of the password. It takes effect when the value of nebula.ssl.sign
is self
. nebula.path.local
string \"/tmp\"
No The local SST file path which needs to be set when users import SST files. nebula.path.remote
string \"/sst\"
No The remote SST file path which needs to be set when users import SST files. nebula.path.hdfs.namenode
string \"hdfs://name_node:9000\"
No The NameNode path which needs to be set when users import SST files. nebula.connection.timeout
int 3000
No The timeout set for Thrift connections. Unit: ms. nebula.connection.retry
int 3
No Retries set for Thrift connections. nebula.execution.retry
int 3
No Retries set for executing nGQL statements. nebula.error.max
int 32
No The maximum number of failures during the import process. When the number of failures reaches the maximum, the submitted Spark job will stop automatically. nebula.error.output
string /tmp/errors
No The path to output error logs. Failed nGQL statement executions are saved in the error log. nebula.rate.limit
int 1024
No The limit on the number of tokens in the token bucket when importing data. nebula.rate.timeout
int 1000
No The timeout period for getting tokens from a token bucket. Unit: milliseconds. Note
NebulaGraph doesn't support vertices without tags by default. To import vertices without tags, enable vertices without tags in the NebulaGraph cluster and then add parameter nebula.enableTagless
to the Exchange configuration with the value true
. For example:
nebula: {\n address:{\n graph:[\"127.0.0.1:9669\"]\n meta:[\"127.0.0.1:9559\"]\n }\n user: root\n pswd: nebula\n space: test\n enableTagless: true\n ......\n\n }\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#vertex_configurations","title":"Vertex configurations","text":"For different data sources, the vertex configurations are different. There are many general parameters and some specific parameters. General parameters and specific parameters of different data sources need to be configured when users configure vertices.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#general_parameters","title":"General parameters","text":"Parameter Type Default value Required Descriptiontags.name
string - Yes The tag name defined in NebulaGraph. tags.type.source
string - Yes Specify a data source. For example, csv
. tags.type.sink
string client
Yes Specify an import method. Optional values are client
and SST
. tags.writeMode
string INSERT
No Types of batch operations on data, including batch inserts, updates, and deletes. Optional values are INSERT
, UPDATE
, DELETE
. tags.deleteEdge
string false
No Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when tags.writeMode
is DELETE
. tags.fields
list[string] - Yes The header or column name of the column corresponding to properties. If there is a header or a column name, please use that name directly. If a CSV file does not have a header, use the form of [_c0, _c1, _c2]
to represent the first column, the second column, the third column, and so on. tags.nebula.fields
list[string] - Yes Property names defined in NebulaGraph, the order of which must correspond to tags.fields
. For example, [_c1, _c2]
corresponds to [name, age]
, which means that values in the second column are the values of the property name
, and values in the third column are the values of the property age
. tags.vertex.field
string - Yes The column of vertex IDs. For example, when a CSV file has no header, users can use _c0
to indicate values in the first column are vertex IDs. tags.vertex.udf.separator
string - No Support merging multiple columns by custom rules. This parameter specifies the join character. tags.vertex.udf.oldColNames
list - No Support merging multiple columns by custom rules. This parameter specifies the names of the columns to be merged. Multiple columns are separated by commas. tags.vertex.udf.newColName
string - No Support merging multiple columns by custom rules. This parameter specifies the new column name. tags.vertex.prefix
string - No Add the specified prefix to the VID. For example, if the VID is 12345
, adding the prefix tag1
will result in tag1_12345
. The underscore cannot be modified. tags.vertex.policy
string - No Supports only the value hash
. Performs hashing operations on VIDs of type string. tags.batch
int 256
Yes The maximum number of vertices written into NebulaGraph in a single batch. tags.partition
int 32
Yes The number of partitions to be created when the data is written to NebulaGraph. If tags.partition ≤ 1
, the number of partitions to be created in NebulaGraph is the same as that in the data source."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_parquetjsonorc_data_sources","title":"Specific parameters of Parquet/JSON/ORC data sources","text":"Parameter Type Default value Required Description tags.path
string - Yes The path of vertex data files in HDFS. Enclose the path in double quotes and start with hdfs://
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_csv_data_sources","title":"Specific parameters of CSV data sources","text":"Parameter Type Default value Required Description tags.path
string - Yes The path of vertex data files in HDFS. Enclose the path in double quotes and start with hdfs://
. tags.separator
string ,
Yes The separator. The default value is a comma (,). For special characters, such as the control character ^A
, you can use ASCII octal \\001
or UNICODE encoded hexadecimal \\u0001
, for the control character ^B
, use ASCII octal \\002
or UNICODE encoded hexadecimal \\u0002
, for the control character ^C
, use ASCII octal \\003
or UNICODE encoded hexadecimal \\u0003
. tags.header
bool true
Yes Whether the file has a header."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_hive_data_sources","title":"Specific parameters of Hive data sources","text":"Parameter Type Default value Required Description tags.exec
string - Yes The statement to query data sources. For example, select name,age from mooc.users
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_maxcompute_data_sources","title":"Specific parameters of MaxCompute data sources","text":"Parameter Type Default value Required Description tags.table
string - Yes The table name of the MaxCompute. tags.project
string - Yes The project name of the MaxCompute. tags.odpsUrl
string - Yes The odpsUrl of the MaxCompute service. For more information about odpsUrl, see Endpoints. tags.tunnelUrl
string - Yes The tunnelUrl of the MaxCompute service. For more information about tunnelUrl, see Endpoints. tags.accessKeyId
string - Yes The accessKeyId of the MaxCompute service. tags.accessKeySecret
string - Yes The accessKeySecret of the MaxCompute service. tags.partitionSpec
string - No Partition descriptions of MaxCompute tables. tags.sentence
string - No Statements to query data sources. The table name in the SQL statement is the same as the value of the table above."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_neo4j_data_sources","title":"Specific parameters of Neo4j data sources","text":"Parameter Type Default value Required Description tags.exec
string - Yes Statements to query data sources. For example: match (n:label) return n.neo4j-field-0
. tags.server
string \"bolt://127.0.0.1:7687\"
Yes The server address of Neo4j. tags.user
string - Yes The Neo4j username with read permissions. tags.password
string - Yes The account password. tags.database
string - Yes The name of the database where source data is saved in Neo4j. tags.check_point_path
string /tmp/test
No The directory where import progress information is stored, which is used for resumable transfers. If not set, resumable transfer is disabled."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_mysqlpostgresql_data_sources","title":"Specific parameters of MySQL/PostgreSQL data sources","text":"Parameter Type Default value Required Description tags.host
string - Yes The MySQL/PostgreSQL server address. tags.port
string - Yes The MySQL/PostgreSQL server port. tags.database
string - Yes The database name. tags.table
string - Yes The name of a table used as a data source. tags.user
string - Yes The MySQL/PostgreSQL username with read permissions. tags.password
string - Yes The account password. tags.sentence
string - Yes Statements to query data sources. For example: \"select teamid, name from team order by teamid\"
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_oracle_data_sources","title":"Specific parameters of Oracle data sources","text":"Parameter Type Default value Required Description tags.url
string - Yes The Oracle server address. tags.driver
string - Yes The Oracle driver address. tags.user
string - Yes The Oracle username with read permissions. tags.password
string - Yes The account password. tags.table
string - Yes The name of a table used as a data source. tags.sentence
string - Yes Statements to query data sources. For example: \"select playerid, name, age from player\"
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_clickhouse_data_sources","title":"Specific parameters of ClickHouse data sources","text":"Parameter Type Default value Required Description tags.url
string - Yes The JDBC URL of ClickHouse. tags.user
string - Yes The ClickHouse username with read permissions. tags.password
string - Yes The account password. tags.numPartition
string - Yes The number of ClickHouse partitions. tags.sentence
string - Yes Statements to query data sources."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_hbase_data_sources","title":"Specific parameters of Hbase data sources","text":"Parameter Type Default value Required Description tags.host
string 127.0.0.1
Yes The HBase server address. tags.port
string 2181
Yes The HBase server port. tags.table
string - Yes The name of a table used as a data source. tags.columnFamily
string - Yes The column family to which a table belongs."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_pulsar_data_sources","title":"Specific parameters of Pulsar data sources","text":"Parameter Type Default value Required Description tags.service
string \"pulsar://localhost:6650\"
Yes The Pulsar server address. tags.admin
string \"http://localhost:8081\"
Yes The admin URL used to connect to Pulsar. tags.options.<topic|topics|topicsPattern>
string - Yes Options offered by Pulsar, which can be configured by choosing one from topic
, topics
, and topicsPattern
. tags.interval.seconds
int 10
Yes The interval for reading messages. Unit: seconds."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_kafka_data_sources","title":"Specific parameters of Kafka data sources","text":"Parameter Type Default value Required Description tags.service
string - Yes The Kafka server address. tags.topic
string - Yes The Kafka topic to read messages from. tags.interval.seconds
int 10
Yes The interval for reading messages. Unit: seconds. tags.securityProtocol
string - No Kafka security protocol. tags.mechanism
string - No The SASL authentication mechanism provided by Kafka. tags.kerberos
bool false
No Whether to enable Kerberos authentication. If tags.mechanism
is kerberos
, this parameter must be set to true
. tags.kerberosServiceName
string - No Kerberos service name. If tags.kerberos
is true
, this parameter must be set."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_for_generating_sst_files","title":"Specific parameters for generating SST files","text":"Parameter Type Default value Required Description tags.path
string - Yes The path of the source file specified to generate SST files. tags.repartitionWithNebula
bool true
No Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file. Enabling this function can reduce the time required to DOWNLOAD and INGEST SST files."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#edge_configurations","title":"Edge configurations","text":"Edge configurations also vary with the data source. When configuring edges, users need to set both the general parameters and the parameters specific to the data source in use.
For the specific parameters of each data source, refer to the corresponding sections above, noting that the tags. prefix becomes edges. in edge configurations. A minimal sketch of an edge entry is shown below.
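The following is a minimal sketch of one entry in the edges list, using only the general parameters. The values are illustrative and assume a headerless CSV source, as in the CSV example later in this manual:

edges: [
  {
    # The edge type name defined in NebulaGraph.
    name: follow
    type: {
      # The data source, for example csv.
      source: csv
      # The import method: client or SST.
      sink: client
    }
    # The property columns and the NebulaGraph property names they map to, in the same order.
    fields: [_c2]
    nebula.fields: [degree]
    # The columns used as the VIDs of the source and destination vertices.
    source: {
      field: _c0
    }
    target: {
      field: _c1
    }
    # The maximum number of edges written into NebulaGraph in a single batch.
    batch: 256
    # The number of partitions to be created when the data is written.
    partition: 32
  }
]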
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#general_parameters_1","title":"General parameters","text":"Parameter Type Default value Required Descriptionedges.name
string - Yes The edge type name defined in NebulaGraph. edges.type.source
string - Yes The data source of edges. For example, csv
. edges.type.sink
string client
Yes The method specified to import data. Optional values are client
and SST
. edges.writeMode
string INSERT
No Types of batch operations on data, including batch inserts, updates, and deletes. Optional values are INSERT
, UPDATE
, DELETE
. edges.fields
list[string] - Yes The header or column name of the column corresponding to properties. If there is a header or column name, please use that name directly. If a CSV file does not have a header, use the form of [_c0, _c1, _c2]
to represent the first column, the second column, the third column, and so on. edges.nebula.fields
list[string] - Yes Edge property names defined in NebulaGraph, the order of which must correspond to edges.fields
. For example, [_c2, _c3]
corresponds to [start_year, end_year]
, which means that values in the third column are the values of the start year, and values in the fourth column are the values of the end year. edges.source.field
string - Yes The column of source vertices of edges. For example, _c0
indicates a value in the first column that is used as the source vertex of an edge. edges.source.prefix
string - No Add the specified prefix to the VID. For example, if the VID is 12345
, adding the prefix tag1
will result in tag1_12345
. The underscore cannot be modified. edges.source.policy
string - No Supports only the value hash
. Performs hashing operations on VIDs of type string. edges.target.field
string - Yes The column of destination vertices of edges. For example, _c0
indicates a value in the first column that is used as the destination vertex of an edge. edges.target.prefix
string - No Add the specified prefix to the VID. For example, if the VID is 12345
, adding the prefix tag1
will result in tag1_12345
. The underscore cannot be modified. edges.target.policy
string - No Supports only the value hash
. Performs hashing operations on VIDs of type string. edges.ranking
int - No The column of rank values. If not specified, all rank values are 0
by default. edges.batch
int 256
Yes The maximum number of edges written into NebulaGraph in a single batch. edges.partition
int 32
Yes The number of partitions to be created when the data is written to NebulaGraph. If edges.partition \u2264 1
, the number of partitions to be created in NebulaGraph is the same as that in the data source."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_for_generating_sst_files_1","title":"Specific parameters for generating SST files","text":"Parameter Type Default value Required Description edges.path
string - Yes The path of the source file specified to generate SST files. edges.repartitionWithNebula
bool true
No Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file. Enabling this function can reduce the time required to DOWNLOAD and INGEST SST files."},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/","title":"Import data from ClickHouse","text":"This topic provides an example of how to use Exchange to import data stored on ClickHouse into NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set ClickHouse data source configuration. In this example, the copied file is called clickhouse_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n# NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n name: player\n type: {\n # Specify the data source file format to ClickHouse.\n source: clickhouse\n # Specify how to import the data of vertexes into NebulaGraph: Client or SST.\n sink: client\n }\n\n # JDBC URL of ClickHouse\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n\n user:\"user\"\n password:\"123456\"\n\n # The number of ClickHouse partitions\n numPartition:\"5\"\n\n sentence:\"select * from player\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [name,age]\n nebula.fields: [name,age]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: clickhouse\n sink: client\n }\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n user:\"user\"\n password:\"123456\"\n numPartition:\"5\"\n sentence:\"select * from team\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:teamid\n }\n batch: 256\n partition: 32\n }\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to ClickHouse.\n source: clickhouse\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # JDBC URL of ClickHouse\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n\n user:\"user\"\n password:\"123456\"\n\n # The number of ClickHouse partitions.\n numPartition:\"5\"\n\n sentence:\"select * from follow\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertexes.\n source: {\n field:src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # In target, use a column in the follow table as the source of the edge's destination vertexes.\n target: {\n field:dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: clickhouse\n sink: client\n }\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n user:\"user\"\n password:\"123456\"\n numPartition:\"5\"\n sentence:\"select * from serve\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field:playerid\n }\n target: {\n field:teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import ClickHouse data into NebulaGraph. For descriptions of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <clickhouse_application.conf_path>\n
Note
JAR packages are available in two ways: compiled them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/clickhouse_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/","title":"Import data from CSV files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local CSV files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#step_2_process_csv_files","title":"Step 2: Process CSV files","text":"Confirm the following information:
Process CSV files to meet Schema requirements.
Note
Exchange supports uploading CSV files with or without headers.
Obtain the CSV file storage path.
After Exchange is compiled, copy the conf file target/classes/application.conf
to set CSV data source configuration. In this example, the copied file is called csv_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example: \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example: \"file:///tmp/xx.csv\".\n path: \"hdfs://192.168.*.*:9000/data/vertex_player.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has headers, use the actual column names.\n fields: [_c1, _c2]\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # The value of vertex must be the same as the column names in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:_c0\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: csv\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/vertex_team.csv\"\n fields: [_c1]\n nebula.fields: [name]\n vertex: {\n field:_c0\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n }\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example: \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example: \"file:///tmp/xx.csv\".\n path: \"hdfs://192.168.*.*:9000/data/edge_follow.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has headers, use the actual column names.\n fields: [_c2]\n\n # Specify the column names in the edge table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The value of vertex must be the same as the column names in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: _c0\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: _c1\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # Specify a column as the source of the rank (optional).\n\n #ranking: rank\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: csv\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/edge_serve.csv\"\n fields: [_c2,_c3]\n nebula.fields: [start_year, end_year]\n source: {\n field: _c0\n }\n target: {\n field: _c1\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import CSV data into NebulaGraph. For descriptions of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <csv_application.conf_path> \n
Note
JAR packages are available in two ways: compiled them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/csv_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
When using Kerberos for security certification, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR.The files in --files
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy the Spark and Kerberos-certified Hadoop in a same cluster to make them share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/","title":"Import data from HBase","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HBase.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in HBase. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
hbase(main):002:0> scan \"player\"\nROW COLUMN+CELL\n player100 column=cf:age, timestamp=1618881347530, value=42\n player100 column=cf:name, timestamp=1618881354604, value=Tim Duncan\n player101 column=cf:age, timestamp=1618881369124, value=36\n player101 column=cf:name, timestamp=1618881379102, value=Tony Parker\n player102 column=cf:age, timestamp=1618881386987, value=33\n player102 column=cf:name, timestamp=1618881393370, value=LaMarcus Aldridge\n player103 column=cf:age, timestamp=1618881402002, value=32\n player103 column=cf:name, timestamp=1618881407882, value=Rudy Gay\n ...\n\nhbase(main):003:0> scan \"team\"\nROW COLUMN+CELL\n team200 column=cf:name, timestamp=1618881445563, value=Warriors\n team201 column=cf:name, timestamp=1618881453636, value=Nuggets\n ...\n\nhbase(main):004:0> scan \"follow\"\nROW COLUMN+CELL\n player100 column=cf:degree, timestamp=1618881804853, value=95\n player100 column=cf:dst_player, timestamp=1618881791522, value=player101\n player101 column=cf:degree, timestamp=1618881824685, value=90\n player101 column=cf:dst_player, timestamp=1618881816042, value=player102\n ...\n\nhbase(main):005:0> scan \"serve\"\nROW COLUMN+CELL\n player100 column=cf:end_year, timestamp=1618881899333, value=2016\n player100 column=cf:start_year, timestamp=1618881890117, value=1997\n player100 column=cf:teamid, timestamp=1618881875739, value=team204\n ...\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set HBase data source configuration. In this example, the copied file is called hbase_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set information about Tag player.\n # If you want to set RowKey as the data source, enter rowkey and the actual column name of the column family.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to HBase.\n source: hbase\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n host:192.168.*.*\n port:2181\n table:\"player\"\n columnFamily:\"cf\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # For example, if rowkey is the source of the VID, enter rowkey.\n vertex:{\n field:rowkey\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # Number of pieces of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set Tag Team information.\n {\n name: team\n type: {\n source: hbase\n sink: client\n }\n host:192.168.*.*\n port:2181\n table:\"team\"\n columnFamily:\"cf\"\n fields: [name]\n nebula.fields: [name]\n vertex:{\n field:rowkey\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to HBase.\n source: hbase\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n host:192.168.*.*\n port:2181\n table:\"follow\"\n columnFamily:\"cf\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source:{\n field:rowkey\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n\n target:{\n field:dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: hbase\n sink: client\n }\n host:192.168.*.*\n port:2181\n table:\"serve\"\n columnFamily:\"cf\"\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source:{\n field:rowkey\n }\n\n target:{\n field:teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import HBase data into NebulaGraph. For descriptions of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <hbase_application.conf_path>\n
Note
JAR packages are available in two ways: compiled them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/hbase_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/","title":"Import data from Hive","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in Hive.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in Hive. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
scala> spark.sql(\"describe basketball.player\").show\n+--------+---------+-------+\n|col_name|data_type|comment|\n+--------+---------+-------+\n|playerid| string| null|\n| age| bigint| null|\n| name| string| null|\n+--------+---------+-------+\n\nscala> spark.sql(\"describe basketball.team\").show\n+----------+---------+-------+\n| col_name|data_type|comment|\n+----------+---------+-------+\n| teamid| string| null|\n| name| string| null|\n+----------+---------+-------+\n\nscala> spark.sql(\"describe basketball.follow\").show\n+----------+---------+-------+\n| col_name|data_type|comment|\n+----------+---------+-------+\n|src_player| string| null|\n|dst_player| string| null|\n| degree| bigint| null|\n+----------+---------+-------+\n\nscala> spark.sql(\"describe basketball.serve\").show\n+----------+---------+-------+\n| col_name|data_type|comment|\n+----------+---------+-------+\n| playerid| string| null|\n| teamid| string| null|\n|start_year| bigint| null|\n| end_year| bigint| null|\n+----------+---------+-------+\n
Note
The Hive data type bigint
corresponds to the NebulaGraph int
.
This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_2_use_spark_sql_to_confirm_hive_sql_statements","title":"Step 2: Use Spark SQL to confirm Hive SQL statements","text":"After the Spark-shell environment is started, run the following statements to ensure that Spark can read data in Hive.
scala> sql(\"select playerid, age, name from basketball.player\").show\nscala> sql(\"select teamid, name from basketball.team\").show\nscala> sql(\"select src_player, dst_player, degree from basketball.follow\").show\nscala> sql(\"select playerid, teamid, start_year, end_year from basketball.serve\").show\n
The following is the result read from the table basketball.player
.
+---------+----+-----------------+\n| playerid| age| name|\n+---------+----+-----------------+\n|player100| 42| Tim Duncan|\n|player101| 36| Tony Parker|\n|player102| 33|LaMarcus Aldridge|\n|player103| 32| Rudy Gay|\n|player104| 32| Marco Belinelli|\n+---------+----+-----------------+\n...\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_3_modify_configuration_file","title":"Step 3: Modify configuration file","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Hive data source configuration. In this example, the copied file is called hive_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # If Spark and Hive are deployed in different clusters, you need to configure the parameters for connecting to Hive. Otherwise, skip these configurations.\n #hive: {\n # waredir: \"hdfs://NAMENODE_IP:9000/apps/svr/hive-xxx/warehouse/\"\n # connectionURL: \"jdbc:mysql://your_ip:3306/hive_spark?characterEncoding=UTF-8\"\n # connectionDriverName: \"com.mysql.jdbc.Driver\"\n # connectionUserName: \"user\"\n # connectionPassword: \"password\"\n #}\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Hive.\n source: hive\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Set the SQL statement to read the data of player table in basketball database.\n exec: \"select playerid, age, name from basketball.player\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n vertex:{\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: hive\n sink: client\n }\n exec: \"select teamid, name from basketball.team\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to Hive.\n source: hive\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Set the SQL statement to read the data of follow table in the basketball database.\n exec: \"select src_player, dst_player, degree from basketball.follow\"\n\n # Specify the column names in the follow table in Fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's starting vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: hive\n sink: client\n }\n exec: \"select playerid, teamid, start_year, end_year from basketball.serve\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import Hive data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <hive_application.conf_path> -h\n
Note
JAR packages are available in two ways: compiled them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/hive_application.conf -h\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
When using Kerberos for security certification, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR.The files in --files
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy the Spark and Kerberos-certified Hadoop in a same cluster to make them share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_6_optional_rebuild_indexes_in_nebulagraph","title":"Step 6: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/","title":"Import data from general JDBC","text":"JDBC data refers to the data of various databases accessed through the JDBC interface. This topic provides an example of how to use Exchange to export MySQL data and import to NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in MySQL. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
mysql> desc player;\n+----------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+----------+-------------+------+-----+---------+-------+\n| playerid | int | YES | | NULL | |\n| age | int | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+----------+-------------+------+-----+---------+-------+\n\nmysql> desc team;\n+--------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+--------+-------------+------+-----+---------+-------+\n| teamid | int | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+--------+-------------+------+-----+---------+-------+\n\nmysql> desc follow;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| src_player | int | YES | | NULL | |\n| dst_player | int | YES | | NULL | |\n| degree | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n\nmysql> desc serve;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| playerid | int | YES | | NULL | |\n| teamid | int | YES | | NULL | |\n| start_year | int | YES | | NULL | |\n| end_year | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. nebula-exchange_spark_2.2 supports only single table queries, not multi-table queries.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#steps","title":"Steps","text":""},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_1_create_the_schema_in_nebulagraph","title":"Step 1: Create the Schema in NebulaGraph","text":"Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element   | Name   | Property
----------|--------|------------------------------
Tag       | player | name string, age int
Tag       | team   | name string
Edge Type | follow | degree int
Edge Type | serve  | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set JDBC data source configuration. In this case, the copied file is called jdbc_application.conf
. For details on each configuration item, see Parameters in the configuration file.
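For example, a minimal way to make the copy described above (the paths assume the default Maven build layout of the Exchange repository):

```bash
# Copy the template shipped with the build and edit the copy for the JDBC source.
cd nebula-exchange/nebula-exchange/target/classes
cp application.conf jdbc_application.conf
```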
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to JDBC.\n source: jdbc\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # URL of the JDBC data source. The example is MySql database.\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n\n # JDBC driver \n driver:\"com.mysql.cj.jdbc.Driver\"\n\n # Database user name and password\n user:\"root\"\n password:\"12345\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter, and can additionally configure sentence.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.player\"\n\n # Use query statement to read data.\n # nebula-exchange_spark_2.2 can configure this parameter. Multi-table queries are not supported. Only the table name needs to be written after from. The form `db.table` is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence:\"select playerid, age, name from player, team order by playerid\"\n\n # (optional)Multiple connections read parameters. See https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html\n partitionColumn:playerid # optional. Must be a numeric, date, or timestamp column from the table in question.\n lowerBound:1 # optional\n upperBound:5 # optional\n numPartitions:5 # optional\n\n\n fetchSize:2 # The JDBC fetch size, which determines how many rows to fetch per round trip.\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. 
For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: jdbc\n sink: client\n }\n\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n driver:\"com.mysql.cj.jdbc.Driver\"\n user:root\n password:\"12345\"\n table:team\n sentence:\"select teamid, name from team order by teamid\"\n partitionColumn:teamid \n lowerBound:1 \n upperBound:5 \n numPartitions:5 \n fetchSize:2 \n\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to JDBC.\n source: jdbc\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n driver:\"com.mysql.cj.jdbc.Driver\"\n user:root\n password:\"12345\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter, and can additionally configure sentence.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.follow\"\n\n # Use query statement to read data.\n # nebula-exchange_spark_2.2 can configure this parameter. Multi-table queries are not supported. Only the table name needs to be written after from. The form `db.table` is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence:\"select src_player,dst_player,degree from follow order by src_player\"\n\n partitionColumn:src_player \n lowerBound:1 \n upperBound:5 \n numPartitions:5 \n fetchSize:2 \n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. 
The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: jdbc\n sink: client\n }\n\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n driver:\"com.mysql.cj.jdbc.Driver\"\n user:root\n password:\"12345\"\n table:serve\n sentence:\"select playerid,teamid,start_year,end_year from serve order by playerid\"\n partitionColumn:playerid \n lowerBound:1 \n upperBound:5 \n numPartitions:5 \n fetchSize:2\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n}\n
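To make the partitioned-read parameters above concrete: with partitionColumn, lowerBound, upperBound, and numPartitions set, Spark splits the table scan into numPartitions range queries of stride (upperBound - lowerBound) / numPartitions. The bounds below are illustrative (chosen so the stride is a whole number), not the values from the example configuration:

```sql
-- Roughly what Spark issues with partitionColumn=playerid,
-- lowerBound=1, upperBound=101, numPartitions=5 (stride = 20):
SELECT * FROM basketball.player WHERE playerid < 21 OR playerid IS NULL;  -- partition 1
SELECT * FROM basketball.player WHERE playerid >= 21 AND playerid < 41;  -- partition 2
-- ... partitions 3 and 4 cover the middle ranges ...
SELECT * FROM basketball.player WHERE playerid >= 81;                    -- partition 5
```

Note that rows outside [lowerBound, upperBound) are still read (the first and last partitions are open-ended); the bounds only control how the work is split.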
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import general JDBC data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <jdbc_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/jdbc_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
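If the output scrolls past, it can be captured and searched afterwards. A possible approach (the log file name is an assumption, not part of Exchange):

```bash
# Capture the Exchange output while still printing it to the console.
${SPARK_HOME}/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \
  nebula-exchange_spark_2.4-3.7.0.jar -c jdbc_application.conf 2>&1 | tee exchange.log

# Then inspect the per-tag and per-edge success counters.
grep "batchSuccess" exchange.log
```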
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/","title":"Import data from JSON files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local JSON files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example. Some sample data are as follows:
player
{\"id\":\"player100\",\"age\":42,\"name\":\"Tim Duncan\"}\n{\"id\":\"player101\",\"age\":36,\"name\":\"Tony Parker\"}\n{\"id\":\"player102\",\"age\":33,\"name\":\"LaMarcus Aldridge\"}\n{\"id\":\"player103\",\"age\":32,\"name\":\"Rudy Gay\"}\n...\n
team
{\"id\":\"team200\",\"name\":\"Warriors\"}\n{\"id\":\"team201\",\"name\":\"Nuggets\"}\n...\n
follow
{\"src\":\"player100\",\"dst\":\"player101\",\"degree\":95}\n{\"src\":\"player101\",\"dst\":\"player102\",\"degree\":90}\n...\n
serve
{\"src\":\"player100\",\"dst\":\"team204\",\"start_year\":\"1997\",\"end_year\":\"2016\"}\n{\"src\":\"player101\",\"dst\":\"team204\",\"start_year\":\"1999\",\"end_year\":\"2018\"}\n...\n
This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element   | Name   | Property
----------|--------|------------------------------
Tag       | player | name string, age int
Tag       | team   | name string
Edge Type | follow | degree int
Edge Type | serve  | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/#step_2_process_json_files","title":"Step 2: Process JSON files","text":"Confirm the following information:
Process JSON files to meet Schema requirements (a quick validation sketch follows this list).
Obtain the JSON file storage path.
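A possible pre-check, assuming the line-delimited files from the data set above and that jq is installed (both are assumptions, not requirements of Exchange), is to count the lines that carry all required keys:

```bash
# Lines that pass the filter have every key the Schema expects;
# a count lower than the line count of the file indicates records to fix first.
jq -c 'select(.id and .age and .name)' vertex_player.json | wc -l
jq -c 'select(.src and .dst and .degree)' edge_follow.json | wc -l
```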
After Exchange is compiled, copy the conf file target/classes/application.conf
to set JSON data source configuration. In this example, the copied file is called json_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\" \n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to JSON.\n source: json\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the JSON file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.json\".\n path: \"hdfs://192.168.*.*:9000/data/vertex_player.json\"\n\n # Specify the key name in the JSON file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # The value of vertex must be the same as that in the JSON file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n{\n name: team\n type: {\n source: json\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/vertex_team.json\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n batch: 256\n partition: 32\n }\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to JSON.\n source: json\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the JSON file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.json\".\n path: \"hdfs://192.168.*.*:9000/data/edge_follow.json\"\n\n # Specify the key name in the JSON file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n\n # Specify the column names in the edge table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The value of vertex must be the same as that in the JSON file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: json\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/edge_serve.json\"\n fields: [start_year,end_year]\n nebula.fields: [start_year, end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n batch: 256\n partition: 32\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
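Before running the import, it can be worth confirming that the locations referenced by path in the configuration above actually exist, for example (substitute the masked namenode address):

```bash
# For HDFS paths such as hdfs://<namenode_ip>:9000/data/vertex_player.json:
hdfs dfs -ls hdfs://<namenode_ip>:9000/data/

# For local paths referenced as file:///tmp/xx.json:
ls -l /tmp/*.json
```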
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import JSON data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <json_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-echange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/json_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
When using Kerberos for security certification, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf can be configured in two ways: as an absolute path to the file, or as a relative path (for example, ./krb5.conf). The relative form works because the resource files uploaded via --files are located in the working directory of the Java virtual machine or JAR. The files in --files must be stored on the machine where the spark-submit command is executed.
Without commands
Deploy Spark and the Kerberos-certified Hadoop in the same cluster so that they share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path> to spark-env.sh in Spark.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/","title":"Import data from Kafka","text":"This topic provides a simple guide to importing Data stored on Kafka into NebulaGraph using Exchange.
Compatibility
Use Exchange 3.5.0, 3.3.0, or 3.0.0 when importing Kafka data. Exchange 3.4.0 added caching of imported data and therefore does not support importing streaming data.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. The following JAR files have been downloaded and placed in the directory SPARK_HOME/jars
of Spark:
The value of tags.type.sink and edges.type.sink must be client; only the client mode is supported for importing Kafka data.
.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element   | Name   | Property
----------|--------|------------------------------
Tag       | player | name string, age int
Tag       | team   | name string
Edge Type | follow | degree int
Edge Type | serve  | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"Note
If some data is stored in Kafka's value field, you need to modify the source code: get the value from Kafka, parse it with the from_json function, and return it as a DataFrame.
After Exchange is compiled, copy the conf file target/classes/application.conf
to set Kafka data source configuration. In this example, the copied file is called kafka_application.conf
. For details on each configuration item, see Parameters in the configuration file.
Note
When importing Kafka data, a configuration file can only handle one tag or edge type. If there are multiple tag or edge types, you need to create multiple configuration files.
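In practice that means one spark-submit run per configuration file, for example (the JAR and configuration file names below are illustrative):

```bash
# One run imports the player tag, a second run imports the follow edge type;
# each run points at its own configuration file.
${SPARK_HOME}/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \
  exchange.jar -c kafka_player_application.conf
${SPARK_HOME}/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \
  exchange.jar -c kafka_follow_application.conf
```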
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n\n # The corresponding Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Kafka.\n source: kafka\n # Specify how to import the data into NebulaGraph. Only client is supported.\n sink: client\n }\n # Kafka server address.\n service: \"127.0.0.1:9092\"\n # Message category.\n topic: \"topic_name1\"\n\n # If Kafka uses Kerberos for security certification, the following parameters need to be set. If Kafka uses SASL or SASL_PLAINTEXT for security certification, you do not need to set kerberos or kerberosServiceName.\n #securityProtocol: SASL_PLAINTEXT\n #mechanism: GASSAPI\n #kerberos: true\n #kerberosServiceName: kafka\n\n # Kafka data has a fixed domain name: key, value, topic, partition, offset, timestamp, timestampType.\n # If multiple fields need to be specified after Spark reads as DataFrame, separate them with commas.\n # Specify the field name in fields. For example, use key for name in NebulaGraph and value for age in Nebula, as shown in the following.\n fields: [key,value]\n nebula.fields: [name,age]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # The key is the same as the value above, indicating that key is used as both VID and property name.\n vertex:{\n field:key\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 10\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 10\n # The interval for message reading. Unit: second.\n interval.seconds: 10\n # The consumer offsets. The default value is latest. 
Optional value are latest and earliest.\n startingOffsets: latest\n # Flow control, with a rate limit on the maximum offset processed per trigger interval, may not be configured.\n # maxOffsetsPerTrigger:10000\n }\n ]\n\n # Processing edges\n #edges: [\n # # Set the information about the Edge Type follow.\n # {\n # # The corresponding Edge Type name in NebulaGraph.\n # name: follow\n\n # type: {\n # # Specify the data source file format to Kafka.\n # source: kafka\n\n # # Specify how to import the Edge type data into NebulaGraph.\n # # Specify how to import the data into NebulaGraph. Only client is supported.\n # sink: client\n # }\n\n # # Kafka server address.\n # service: \"127.0.0.1:9092\"\n # # Message category.\n # topic: \"topic_name3\"\n\n # # If Kafka uses Kerberos for security certification, the following parameters need to be set. If Kafka uses SASL or SASL_PLAINTEXT for security certification, you do not need to set kerberos or kerberosServiceName.\n # #securityProtocol: SASL_PLAINTEXT\n # #mechanism: GASSAPI\n # #kerberos: true\n # #kerberosServiceName: kafka\n\n # # Kafka data has a fixed domain name: key, value, topic, partition, offset, timestamp, timestampType.\n # # If multiple fields need to be specified after Spark reads as DataFrame, separate them with commas.\n # # Specify the field name in fields. For example, use key for degree in Nebula, as shown in the following.\n # fields: [key]\n # nebula.fields: [degree]\n\n # # In source, use a column in the topic as the source of the edge's source vertex.\n # # In target, use a column in the topic as the source of the edge's destination vertex.\n # source:{\n # field:timestamp\n # # udf:{\n # # separator:\"_\"\n # # oldColNames:[field-0,field-1,field-2]\n # # newColName:new-field\n # # }\n # # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # # prefix:\"tag1\"\n # # Performs hashing operations on VIDs of type string.\n # # policy:hash\n # }\n\n\n # target:{\n # field:offset\n # # udf:{\n # # separator:\"_\"\n # # oldColNames:[field-0,field-1,field-2]\n # # newColName:new-field\n # # }\n # # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # # prefix:\"tag1\"\n # # Performs hashing operations on VIDs of type string.\n # # policy:hash\n # }\n\n # # (Optional) Specify a column as the source of the rank.\n # #ranking: rank\n\n # # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n # #writeMode: INSERT\n\n # # The number of data written to NebulaGraph in a single batch.\n # batch: 10\n\n # # The number of partitions to be created when the data is written to NebulaGraph.\n # partition: 10\n\n # # The interval for message reading. Unit: second.\n # interval.seconds: 10\n # # The consumer offsets. The default value is latest. Optional value are latest and earliest.\n # startingOffsets: latest\n # # Flow control, with a rate limit on the maximum offset processed per trigger interval, may not be configured.\n # # maxOffsetsPerTrigger:10000\n # }\n #]\n}\n
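Before starting the job, a quick way to confirm that the topic in the configuration above actually delivers messages is the standard Kafka console consumer (the script location varies by installation; this is a sanity check, not part of Exchange):

```bash
# Read a handful of messages from the configured topic and exit.
kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 \
  --topic topic_name1 --from-beginning --max-messages 5
```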
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import Kafka data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <kafka_application.conf_path>\n
Note
Example:
No security certification
${SPARK_HOME}/bin/spark-submit --master \"local\" \\\n--class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar \\\n-c /root/nebula-exchange/target/classes/kafka_application.conf\n
Enable Kerberos security certification
${SPARK_HOME}/bin/spark-submit --master \"local\" \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf -Djava.security.krb5.conf=/path/krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf -Djava.security.krb5.conf=/path/krb5.conf\" \\\n--files /local/path/kafka_client_jaas.conf,/local/path/kafka.keytab,/local/path/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar \\\n-c /root/nebula-exchange/target/classes/kafka_application.conf\n
Enable SASL/SASL_PLAINTEXT security certification
${SPARK_HOME}/bin/spark-submit --master \"local\" \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf\" \\\n--files /local/path/kafka_client_jaas.conf \\\n--class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar \\\n-c /root/nebula-exchange/target/classes/kafka_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/","title":"Import data from MaxCompute","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in MaxCompute.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element   | Name   | Property
----------|--------|------------------------------
Tag       | player | name string, age int
Tag       | team   | name string
Edge Type | follow | degree int
Edge Type | serve  | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set MaxCompute data source configuration. In this example, the copied file is called maxcompute_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n name: player\n type: {\n # Specify the data source file format to MaxCompute.\n source: maxcompute\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Table name of MaxCompute.\n table:player\n\n # Project name of MaxCompute.\n project:project\n\n # OdpsUrl and tunnelUrl for the MaxCompute service.\n # The address is https://help.aliyun.com/document_detail/34951.html.\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n\n # AccessKeyId and accessKeySecret of the MaxCompute service.\n accessKeyId:xxx\n accessKeySecret:xxx\n\n # Partition description of the MaxCompute table. This configuration is optional.\n partitionSpec:\"dt='partition1'\"\n\n # Ensure that the table name in the SQL statement is the same as the value of the table above. This configuration is optional.\n sentence:\"select id, name, age, playerid from player where id < 10\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields:[name, age]\n nebula.fields:[name, age]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n vertex:{\n field: playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: maxcompute\n sink: client\n }\n table:team\n project:project\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n accessKeyId:xxx\n accessKeySecret:xxx\n partitionSpec:\"dt='partition1'\"\n sentence:\"select id, name, teamid from team where id < 10\"\n fields:[name]\n nebula.fields:[name]\n vertex:{\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type:{\n # Specify the data source file format to MaxCompute.\n source:maxcompute\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink:client\n }\n\n # Table name of MaxCompute.\n table:follow\n\n # Project name of MaxCompute.\n project:project\n\n # OdpsUrl and tunnelUrl for MaxCompute service.\n # The address is https://help.aliyun.com/document_detail/34951.html.\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n\n # AccessKeyId and accessKeySecret of the MaxCompute service.\n accessKeyId:xxx\n accessKeySecret:xxx\n\n # Partition description of the MaxCompute table. This configuration is optional.\n partitionSpec:\"dt='partition1'\"\n\n # Ensure that the table name in the SQL statement is the same as the value of the table above. This configuration is optional.\n sentence:\"select * from follow\"\n\n # Specify the column names in the follow table in Fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields:[degree]\n nebula.fields:[degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n source:{\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n target:{\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition:10\n\n # The number of data written to NebulaGraph in a single batch.\n batch:10\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type:{\n source:maxcompute\n sink:client\n }\n table:serve\n project:project\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n accessKeyId:xxx\n accessKeySecret:xxx\n partitionSpec:\"dt='partition1'\"\n sentence:\"select * from serve\"\n fields:[start_year,end_year]\n nebula.fields:[start_year,end_year]\n source:{\n field: playerid\n }\n target:{\n field: teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n partition:10\n batch:10\n }\n ]\n}\n
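One detail worth re-checking in the configuration above: when sentence is set, it must query the same table that the table parameter names. A valid pairing, mirroring the player example, looks like this (illustrative):

```sql
-- table:player pairs with a query over that same table:
SELECT id, name, age, playerid FROM player WHERE id < 10;
```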
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import MaxCompute data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <maxcompute_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/maxcompute_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/","title":"Import data from MySQL/PostgreSQL","text":"This topic provides an example of how to use Exchange to export MySQL data and import to NebulaGraph. It also applies to exporting data from PostgreSQL into NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in MySQL. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
mysql> desc player;\n+----------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+----------+-------------+------+-----+---------+-------+\n| playerid | varchar(30) | YES | | NULL | |\n| age | int | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+----------+-------------+------+-----+---------+-------+\n\nmysql> desc team;\n+--------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+--------+-------------+------+-----+---------+-------+\n| teamid | varchar(30) | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+--------+-------------+------+-----+---------+-------+\n\nmysql> desc follow;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| src_player | varchar(30) | YES | | NULL | |\n| dst_player | varchar(30) | YES | | NULL | |\n| degree | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n\nmysql> desc serve;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| playerid | varchar(30) | YES | | NULL | |\n| teamid | varchar(30) | YES | | NULL | |\n| start_year | int | YES | | NULL | |\n| end_year | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.SPARK_HOME/jars
of Spark. nebula-exchange_spark_2.2 supports only single table queries, not multi-table queries.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#steps","title":"Steps","text":""},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_1_create_the_schema_in_nebulagraph","title":"Step 1: Create the Schema in NebulaGraph","text":"Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element   | Name   | Property
----------|--------|------------------------------
Tag       | player | name string, age int
Tag       | team   | name string
Edge Type | follow | degree int
Edge Type | serve  | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set MySQL data source configuration. In this case, the copied file is called mysql_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to MySQL.\n source: mysql\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n user:\"test\"\n password:\"123456\"\n database:\"basketball\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.player\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from people, player, team\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: mysql\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n database:\"basketball\"\n table:\"team\"\n user:\"test\"\n password:\"123456\"\n sentence:\"select teamid, name from team order by teamid;\"\n\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to MySQL.\n source: mysql\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n user:\"test\"\n password:\"123456\"\n database:\"basketball\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.follow\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from follow, serve\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: mysql\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n database:\"basketball\"\n table:\"serve\"\n user:\"test\"\n password:\"123456\"\n sentence:\"select playerid,teamid,start_year,end_year from serve order by playerid;\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import MySQL data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <mysql_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/mysql_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
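For instance, one way to pull these counters out of a local run is to pipe the submit output through grep (a sketch; it assumes the driver logs are printed to the console, as with --master \"local\"):
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <mysql_application.conf_path> 2>&1 | grep \"batchSuccess\"\n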
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
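SHOW STATS reads the result of the most recent stats job, so a typical sequence (a sketch; it assumes the job finishes before the query) is:
nebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\n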
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/","title":"Import data from Neo4j","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in Neo4j.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#implementation_method","title":"Implementation method","text":"Exchange uses Neo4j Driver 4.0.1 to read Neo4j data. Before batch export, you need to write Cypher statements that are automatically executed based on labels and relationship types and the number of Spark partitions in the configuration file to improve data export performance.
When Exchange reads Neo4j data, it needs to do the following:
The Reader in Exchange replaces the statement following the Cypher RETURN statement in the exec part of the configuration file with COUNT(*), executes this statement to get the total amount of data, and then calculates the starting offset and size of each partition based on the number of Spark partitions.
(Optional) If the user has configured the check_point_path directory, Reader reads the files in the directory. When resuming a transfer from a breakpoint, Reader calculates the offset and size that each Spark partition should have.
In each Spark partition, the Reader in Exchange adds different SKIP and LIMIT clauses to the Cypher statement and calls the Neo4j Driver for parallel execution to distribute data to different Spark partitions.
The Reader finally processes the returned data into a DataFrame.
At this point, Exchange has finished exporting the Neo4j data. The data is then written in parallel to the NebulaGraph database.
The whole process is illustrated below.
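To make the rewriting concrete, here is a sketch based on the exec statement used later in this topic; the COUNT(*) query and the per-partition SKIP/LIMIT values are illustrative, not literal Exchange output:
// exec statement as configured in the file:\nmatch (n:player) return n.id as id, n.age as age, n.name as name\n\n// statement derived by the Reader to get the total amount of data:\nmatch (n:player) return COUNT(*)\n\n// statement executed in one Spark partition (offset and size differ per partition):\nmatch (n:player) return n.id as id, n.age as age, n.name as name SKIP 0 LIMIT 100\n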
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Hardware specifications:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or the compiled .jar file has been downloaded directly.
Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#step_2_configuring_source_data","title":"Step 2: Configuring source data","text":"To speed up the export of Neo4j data, create indexes for the corresponding properties in the Neo4j database. For more information, refer to the Neo4j manual.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#step_3_modify_configuration_files","title":"Step 3: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Neo4j data source configuration. In this example, the copied file is called neo4j_application.conf
. For details on each configuration item, see Parameters in the configuration file.
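For instance, the copy might be done like this (the working directory is an assumption; adjust it to where Exchange was compiled):
cp target/classes/application.conf target/classes/neo4j_application.conf\n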
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n space: basketballplayer\n\n connection: {\n timeout: 3000\n retry: 3\n }\n\n execution: {\n retry: 3\n }\n\n error: {\n max: 32\n output: /tmp/errors\n }\n\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n\n\n # Set the information about the Tag player\n {\n name: player\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n # bolt 3 does not support multiple databases, do not configure database names. 4 and above can configure database names.\n # database:neo4j\n exec: \"match (n:player) return n.id as id, n.age as age, n.name as name\"\n fields: [age,name]\n nebula.fields: [age,name]\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n # Set the information about the Tag Team\n {\n name: team\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n database:neo4j\n exec: \"match (n:team) return n.id as id,n.name as name\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow\n {\n name: follow\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n # bolt 3 does not support multiple databases, do not configure database names. 4 and above can configure database names.\n # database:neo4j\n exec: \"match (a:player)-[r:follow]->(b:player) return a.id as src, b.id as dst, r.degree as degree order by id(r)\"\n fields: [degree]\n nebula.fields: [degree]\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. 
The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n # Set the information about the Edge Type serve\n {\n name: serve\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n database:neo4j\n exec: \"match (a:player)-[r:serve]->(b:team) return a.id as src, b.id as dst, r.start_year as start_year, r.end_year as end_year order by id(r)\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n #ranking: rank\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#exec_configuration","title":"Exec configuration","text":"When configuring either the tags.exec
or edges.exec
parameters, you need to fill in the Cypher query. To prevent loss of data during import, it is strongly recommended to include ORDER BY
clause in Cypher queries. Meanwhile, in order to improve data import efficiency, it is better to select indexed properties for ordering. If there is no index, users can also observe the default order and select the appropriate properties for ordering to improve efficiency. If the pattern of the default order cannot be found, users can order them by the ID of the vertex or relationship and set the partition
to a small value to reduce the ordering pressure of Neo4j.
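For example, an exec statement following this advice could look like the sketch below; it assumes the id property of player is indexed in Neo4j:
exec: \"match (n:player) return n.id as id, n.age as age, n.name as name order by n.id\"\n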
Note
Using the ORDER BY clause lengthens the data import time.
Exchange needs to execute different SKIP and LIMIT Cypher statements on different Spark partitions, so SKIP and LIMIT clauses cannot be included in the Cypher statements corresponding to tags.exec and edges.exec.
NebulaGraph uses the ID as the unique primary key when creating vertexes and edges, and overwrites the existing data for that primary key. So, if a Neo4j property value is used as NebulaGraph's ID and that value is duplicated in Neo4j, duplicate IDs will be generated. Only one of the corresponding records will be stored in NebulaGraph, and the others will be overwritten. Because the import process writes data to NebulaGraph concurrently, the final saved record is not guaranteed to be the latest one in Neo4j.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#check_point_path_configuration","title":"check_point_path configuration","text":"If breakpoint transfers are enabled, to avoid data loss, the state of the database should not change between the breakpoint and the transfer. For example, data cannot be added or deleted, and the partition
quantity configuration should not be changed.
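The relevant settings appear in each tag or edge block of the configuration file; a minimal sketch with the values used in this example:
partition: 10\nbatch: 1000\ncheck_point_path: /tmp/test\n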
Run the following command to import Neo4j data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <neo4j_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/neo4j_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/","title":"Import data from Oracle","text":"This topic provides an example of how to use Exchange to export Oracle data and import to NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in Oracle. All vertexes and edges are stored in the player, team, follow, and serve tables. The following are some of the data for each table.
oracle> desc player;\n+-----------+-------+---------------+ \n| Column | Null | Type |\n+-----------+-------+---------------+ \n| PLAYERID | - | VARCHAR2(30) |\n| NAME | - | VARCHAR2(30) |\n| AGE | - | NUMBER |\n+-----------+-------+---------------+ \n\noracle> desc team;\n+-----------+-------+---------------+ \n| Column | Null | Type |\n+-----------+-------+---------------+ \n| TEAMID | - | VARCHAR2(30) |\n| NAME | - | VARCHAR2(30) |\n+-----------+-------+---------------+ \n\noracle> desc follow;\n+-------------+-------+---------------+ \n| Column | Null | Type |\n+-------------+-------+---------------+ \n| SRC_PLAYER | - | VARCHAR2(30) |\n| DST_PLAYER | - | VARCHAR2(30) |\n| DEGREE | - | NUMBER |\n+-------------+-------+---------------+ \n\noracle> desc serve;\n+------------+-------+---------------+ \n| Column | Null | Type |\n+------------+-------+---------------+ \n| PLAYERID | - | VARCHAR2(30) |\n| TEAMID | - | VARCHAR2(30) |\n| START_YEAR | - | NUMBER |\n| END_YEAR | - | NUMBER |\n+------------+-------+---------------+ \n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or the compiled .jar file has been downloaded directly.
nebula-exchange_spark_2.2 supports only single table queries, not multi-table queries.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#steps","title":"Steps","text":""},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_1_create_the_schema_in_nebulagraph","title":"Step 1: Create the Schema in NebulaGraph","text":"Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Oracle data source configuration. In this case, the copied file is called oracle_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Oracle.\n source: oracle\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.player\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from people, player, team\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: oracle\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n table: \"basketball.team\"\n sentence: \"select teamid, name from team\"\n\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to Oracle.\n source: oracle\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.follow\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from follow, serve\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: oracle\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n table: \"basketball.serve\"\n sentence: \"select playerid, teamid, start_year, end_year from serve\"\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import Oracle data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <oracle_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/oracle_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/","title":"Import data from ORC files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local ORC files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or the compiled .jar file has been downloaded directly.
Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#step_2_process_orc_files","title":"Step 2: Process ORC files","text":"Confirm the following information:
Process ORC files to meet Schema requirements.
Obtain the ORC file storage path.
After Exchange is compiled, copy the conf file target/classes/application.conf
to set ORC data source configuration. In this example, the copied file is called orc_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n name: player\n type: {\n # Specify the data source file format to ORC.\n source: orc\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the ORC file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.orc\".\n path: \"hdfs://192.168.*.*:9000/data/vertex_player.orc\"\n\n # Specify the key name in the ORC file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [age,name]\n\n # Specify the property names defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n # The value of vertex must be consistent with the field in the ORC file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag team.\n {\n name: team\n type: {\n source: orc\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/vertex_team.orc\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n batch: 256\n partition: 32\n }\n\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to ORC.\n source: orc\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the ORC file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.orc\".\n path: \"hdfs://192.168.*.*:9000/data/edge_follow.orc\"\n\n # Specify the key name in the ORC file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [degree]\n\n # Specify the property names defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The value of vertex must be consistent with the field in the ORC file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge type serve.\n {\n name: serve\n type: {\n source: orc\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/edge_serve.orc\"\n fields: [start_year,end_year]\n nebula.fields: [start_year, end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n batch: 256\n partition: 32\n }\n\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import ORC data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <orc_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/orc_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
When using Kerberos for security certification, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf and --files in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf can be configured in two ways: as an absolute path, or as a relative path (for example, ./krb5.conf). The resource files uploaded via --files are located in the working directory of the Java virtual machine or JAR.
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy the Spark and Kerberos-certified Hadoop in a same cluster to make them share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/","title":"Import data from Parquet files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local Parquet files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#step_2_process_parquet_files","title":"Step 2: Process Parquet files","text":"Confirm the following information:
Process Parquet files to meet Schema requirements.
Obtain the Parquet file storage path.
After Exchange is compiled, copy the conf file target/classes/application.conf
to set Parquet data source configuration. In this example, the copied file is called parquet_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Parquet.\n source: parquet\n\n # Specifies how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the Parquet file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.parquet\".\n path: \"hdfs://192.168.*.13:9000/data/vertex_player.parquet\"\n\n # Specify the key name in the Parquet file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [age,name]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n # The value of vertex must be consistent with the field in the Parquet file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag team.\n {\n name: team\n type: {\n source: parquet\n sink: client\n }\n path: \"hdfs://192.168.11.13:9000/data/vertex_team.parquet\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n batch: 256\n partition: 32\n }\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to Parquet.\n source: parquet\n\n # Specifies how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the Parquet file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.parquet\".\n path: \"hdfs://192.168.11.13:9000/data/edge_follow.parquet\"\n\n # Specify the key name in the Parquet file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [degree]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The values of vertex must be consistent with the fields in the Parquet file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge type serve.\n {\n name: serve\n type: {\n source: parquet\n sink: client\n }\n path: \"hdfs://192.168.11.13:9000/data/edge_serve.parquet\"\n fields: [start_year,end_year]\n nebula.fields: [start_year, end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n batch: 256\n partition: 32\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import Parquet data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <parquet_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/parquet_application.conf\n
You can search for batchSuccess.<tag_name/edge_name> in the command output to check the number of successes. For example, batchSuccess.follow: 300.
When using Kerberos for security certification, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf and --files in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf can be configured in two ways: as an absolute path, or as a relative path (for example, ./krb5.conf). The resource files uploaded via --files are located in the working directory of the Java virtual machine or JAR.
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy the Spark and Kerberos-certified Hadoop in a same cluster to make them share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/","title":"Import data from Pulsar","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in Pulsar.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.tags.type.sink
and edges.type.sink
is client
.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Pulsar data source configuration. In this example, the copied file is called pulsar_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertices\n tags: [\n # Set the information about the Tag player.\n {\n # The corresponding Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Pulsar.\n source: pulsar\n # Specify how to import the data into NebulaGraph. Only client is supported.\n sink: client\n }\n # The address of the Pulsar server.\n service: \"pulsar://127.0.0.1:6650\"\n # admin.url of pulsar.\n admin: \"http://127.0.0.1:8081\"\n # The Pulsar option can be configured from topic, topics or topicsPattern.\n options: {\n topics: \"topic1,topic2\"\n }\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex:{\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 10\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 10\n # The interval for message reading. 
Unit: second.\n interval.seconds: 10\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: pulsar\n sink: client\n }\n service: \"pulsar://127.0.0.1:6650\"\n admin: \"http://127.0.0.1:8081\"\n options: {\n topics: \"topic1,topic2\"\n }\n fields: [name]\n nebula.fields: [name]\n vertex:{\n field:teamid\n }\n batch: 10\n partition: 10\n interval.seconds: 10\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about Edge Type follow\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to Pulsar.\n source: pulsar\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph. Only client is supported.\n sink: client\n }\n\n # The address of the Pulsar server.\n service: \"pulsar://127.0.0.1:6650\"\n # admin.url of pulsar.\n admin: \"http://127.0.0.1:8081\"\n # The Pulsar option can be configured from topic, topics or topicsPattern.\n options: {\n topics: \"topic1,topic2\"\n }\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source:{\n field:src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target:{\n field:dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 10\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 10\n\n # The interval for message reading. Unit: second.\n interval.seconds: 10\n }\n\n # Set the information about the Edge Type serve\n {\n name: serve\n type: {\n source: Pulsar\n sink: client\n }\n service: \"pulsar://127.0.0.1:6650\"\n admin: \"http://127.0.0.1:8081\"\n options: {\n topics: \"topic1,topic2\"\n }\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source:{\n field:playerid\n }\n\n target:{\n field:teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 10\n partition: 10\n interval.seconds: 10\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import Pulsar data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <pulsar_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/pulsar_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
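If you save the command output to a log file, you can filter for these counters afterwards. A minimal sketch, assuming the output was redirected to a hypothetical file named exchange.log:
grep \"batchSuccess\" exchange.log\n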
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
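For example, a minimal index creation and rebuild for the player Tag (the index name player_index and the property length 20 are illustrative choices):
nebula> CREATE TAG INDEX IF NOT EXISTS player_index ON player(name(20));\nnebula> REBUILD TAG INDEX player_index;\n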
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/","title":"Import data from SST files","text":"This topic provides an example of how to generate the data from the data source into an SST (Sorted String Table) file and save it on HDFS, and then import it into NebulaGraph. The sample data source is a CSV file.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#precautions","title":"Precautions","text":"Exchange supports two data import modes:
The following describes the scenarios, implementation methods, prerequisites, and steps for generating an SST file and importing data.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#scenarios","title":"Scenarios","text":"Suitable for online services, because the generation almost does not affect services (just reads the Schema), and the import speed is fast.
Caution
Although the import speed is fast, write operations in the corresponding space are blocked during the import period (about 10 seconds). Therefore, you are advised to import data in off-peak hours.
The underlying code in NebulaGraph uses RocksDB as the key-value storage engine. RocksDB is a disk-based storage engine that provides a series of APIs for creating and importing SST files, which helps quickly import massive amounts of data.
The SST file is an internal file containing an arbitrarily long set of ordered key-value pairs for efficient storage of large amounts of key-value data. The entire process of generating SST files is mainly done by Exchange Reader, sstProcessor, and sstWriter. The whole data processing steps are as follows:
Reader reads data from the data source.
sstProcessor generates the SST file from the NebulaGraph's Schema information and uploads it to the HDFS. For details about the format of the SST file, see Data Storage Format.
sstWriter opens a file and inserts data. When generating SST files, keys must be written in sequence.
After the SST file is generated, RocksDB imports the SST file into NebulaGraph using the IngestExternalFile()
method. For example:
std::string file_path1 = \"/home/usr/file1.sst\";\nstd::string file_path2 = \"/home/usr/file2.sst\";\nIngestExternalFileOptions ifo;\n// Import two SST files\nStatus s = db_->IngestExternalFile({file_path1, file_path2}, ifo);\nif (!s.ok()) {\n printf(\"Error while adding file %s and %s, Error %s\\n\",\n file_path1.c_str(), file_path2.c_str(), s.ToString().c_str());\n return 1;\n}\n
When the IngestExternalFile()
method is called, RocksDB copies the file to the data directory by default and blocks RocksDB write operations. If the key range in the SST file overlaps with the Memtable key range, the Memtable is flushed to the hard disk first. After placing the SST file in the optimal location in the LSM tree, RocksDB assigns a global sequence number to the file and re-enables write operations.
This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
--ws_storage_http_port
in the Meta service configuration file is the same as --ws_http_port
in the Storage service configuration file. For example, 19779
.--ws_meta_http_port
in the Graph service configuration file is the same as --ws_http_port
in the Meta service configuration file. For example, 19559
..jar
file directly.JAVA_HOME
has been configured.The Hadoop service has been installed and started.
Note
--move_files=true
to the Storage Service configuration file.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
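To verify the Schema, you can list the created Tags and Edge types, for example:
nebula> SHOW TAGS;\nnebula> SHOW EDGES;\n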
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#step_2_process_csv_files","title":"Step 2: Process CSV files","text":"Confirm the following information:
Process CSV files to meet Schema requirements.
Note
Exchange supports uploading CSV files with or without headers.
Obtain the CSV file storage path.
After Exchange is compiled, copy the conf file target/classes/application.conf
to set SST data source configuration. In this example, the copied file is called sst_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n\n master:local\n\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n\n executor: {\n memory:1G\n }\n\n cores:{\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n graph:[\"192.8.168.XXX:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"192.8.168.XXX:9559\"]\n }\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n space: basketballplayer\n\n # SST file configuration\n path:{\n # The local directory that temporarily stores generated SST files\n local:\"/tmp\"\n\n # The path for storing the SST file in the HDFS\n remote:\"/sst\"\n\n # The NameNode address of HDFS, for example, \"hdfs://<ip/hostname>:<port>\"\n hdfs.namenode: \"hdfs://*.*.*.*:9000\"\n }\n\n # The connection parameters of clients\n connection: {\n # The timeout duration of socket connection and execution. Unit: milliseconds.\n timeout: 30000\n }\n\n error: {\n # The maximum number of failures that will exit the application.\n max: 32\n # Failed import jobs are logged in the output path.\n output: /tmp/errors\n }\n\n # Use Google's RateLimiter to limit requests to NebulaGraph.\n rate: {\n # Steady throughput of RateLimiter.\n limit: 1024\n\n # Get the allowed timeout duration from RateLimiter. Unit: milliseconds.\n timeout: 1000\n }\n }\n\n\n # Processing vertices\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: sst\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://<ip/hostname>:port/xx/xx.csv\".\n path: \"hdfs://*.*.*.*:9000/dataset/vertex_player.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has a header, use the actual column name.\n fields: [_c1, _c2]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of VIDs in NebulaGraph.\n # The value of vertex must be consistent with the column name in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:_c0\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n\n # Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file.\n repartitionWithNebula: false\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: csv\n sink: sst\n }\n path: \"hdfs://*.*.*.*:9000/dataset/vertex_team.csv\"\n fields: [_c1]\n nebula.fields: [name]\n vertex: {\n field:_c0\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n repartitionWithNebula: false\n }\n # If more vertices need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: sst\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://<ip/hostname>:port/xx/xx.csv\".\n path: \"hdfs://*.*.*.*:9000/dataset/edge_follow.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has a header, use the actual column name.\n fields: [_c2]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertices.\n # The value of vertex must be consistent with the column name in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: _c0\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: _c1\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # (Optional) Specify a column as the source of the rank.\n\n #ranking: rank\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n\n # Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file.\n repartitionWithNebula: false\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: csv\n sink: sst\n }\n path: \"hdfs://*.*.*.*:9000/dataset/edge_serve.csv\"\n fields: [_c2,_c3]\n nebula.fields: [start_year, end_year]\n source: {\n field: _c0\n }\n target: {\n field: _c1\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n repartitionWithNebula: false\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#step_4_generate_the_sst_file","title":"Step 4: Generate the SST file","text":"Run the following command to generate the SST file from the CSV source file. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --conf spark.sql.shuffle.partitions=<shuffle_concurrency> --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <sst_application.conf_path>\n
Note
When generating SST files, a Spark shuffle operation is involved. Note that the configuration spark.sql.shuffle.partitions
should be added when you submit the command.
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --conf spark.sql.shuffle.partitions=200 --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/sst_application.conf\n
After the task is complete, you can view the generated SST file in the /sst
directory (specified by the nebula.path.remote
parameter) on HDFS.
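For example, you can list the generated files with the HDFS CLI (assuming the remote path /sst configured above):
hdfs dfs -ls /sst\n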
Note
If you modify the Schema, such as rebuilding the graph space, modifying the Tag, or modifying the Edge type, you need to regenerate the SST file because the SST file verifies the space ID, Tag ID, and Edge ID.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#step_5_import_the_sst_file","title":"Step 5: Import the SST file","text":"Note
Confirm the following information before importing:
HADOOP_HOME
and JAVA_HOME
.--ws_storage_http_port
in the Meta service configuration file (add it manually if it does not exist) is the same as the --ws_http_port
in the Storage service configuration file. For example, both are 19779
.--ws_meta_http_port
in the Graph service configuration file (add it manually if it does not exist) is the same as the --ws_http_port
in the Meta service configuration file. For example, both are 19559
.Connect to the NebulaGraph database using the client tool and import the SST file as follows:
Run the following command to select the graph space you created earlier.
nebula> USE basketballplayer;\n
Run the following command to download the SST file:
nebula> SUBMIT JOB DOWNLOAD HDFS \"hdfs://<hadoop_address>:<hadoop_port>/<sst_file_path>\";\n
For example:
nebula> SUBMIT JOB DOWNLOAD HDFS \"hdfs://*.*.*.*:9000/sst\";\n
Run the following command to import the SST file:
nebula> SUBMIT JOB INGEST;\n
Note
download
folder in the space ID in the data/storage/nebula
directory in the NebulaGraph installation path, and then download the SST file again. If the space has multiple copies, the download
folder needs to be deleted on all machines where the copies are saved.SUBMIT JOB INGEST;
.Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"k8s-operator/1.introduction-to-nebula-operator/","title":"What is NebulaGraph Operator","text":""},{"location":"k8s-operator/1.introduction-to-nebula-operator/#concept","title":"Concept","text":"NebulaGraph Operator is a tool to automate the deployment, operation, and maintenance of NebulaGraph clusters on Kubernetes. Building upon the excellent scalability mechanism of Kubernetes, NebulaGraph introduced its operation and maintenance knowledge into the Kubernetes system, which makes NebulaGraph a real cloud-native graph database.
"},{"location":"k8s-operator/1.introduction-to-nebula-operator/#how_it_works","title":"How it works","text":"For resource types that do not exist within Kubernetes, you can register them by adding custom API objects. The common way is to use the CustomResourceDefinition.
NebulaGraph Operator abstracts the deployment management of NebulaGraph clusters as a CRD. By combining multiple built-in API objects including StatefulSet, Service, and ConfigMap, the routine management and maintenance of a NebulaGraph cluster are coded as a control loop in the Kubernetes system. When a CR instance is submitted, NebulaGraph Operator drives database clusters to the final state according to the control process.
"},{"location":"k8s-operator/1.introduction-to-nebula-operator/#features","title":"Features","text":"The following features are already available in NebulaGraph Operator:
NebulaGraph Operator does not support the v1.x version of NebulaGraph. NebulaGraph Operator version and the corresponding NebulaGraph version are as follows:
NebulaGraph NebulaGraph Operator 3.5.x ~ 3.6.0 1.5.0 ~ 1.7.x 3.0.0 ~ 3.4.1 1.3.0, 1.4.0 ~ 1.4.2 3.0.0 ~ 3.3.x 1.0.0, 1.1.0, 1.2.0 2.5.x ~ 2.6.x 0.9.0 2.5.x 0.8.0Legacy version compatibility
Release
"},{"location":"k8s-operator/5.FAQ/","title":"FAQ","text":""},{"location":"k8s-operator/5.FAQ/#does_nebulagraph_operator_support_the_v1x_version_of_nebulagraph","title":"Does NebulaGraph Operator support the v1.x version of NebulaGraph?","text":"No, because the v1.x version of NebulaGraph does not support DNS, and NebulaGraph Operator requires the use of DNS.
"},{"location":"k8s-operator/5.FAQ/#is_cluster_stability_guaranteed_if_using_local_storage","title":"Is cluster stability guaranteed if using local storage?","text":"There is no guarantee. Using local storage means that the Pod is bound to a specific node, and NebulaGraph Operator does not currently support failover in the event of a failure of the bound node.
"},{"location":"k8s-operator/5.FAQ/#how_to_ensure_the_stability_of_a_cluster_when_scaling_the_cluster","title":"How to ensure the stability of a cluster when scaling the cluster?","text":"It is suggested to back up data in advance so that you can roll back data in case of failure.
"},{"location":"k8s-operator/5.FAQ/#is_the_replica_in_the_operator_docs_the_same_as_the_replica_in_the_nebulagraph_core_docs","title":"Is the replica in the Operator docs the same as the replica in the NebulaGraph core docs?","text":"They are different concepts. A replica in the Operator docs indicates a pod replica in K8s, while a replica in the core docs is a replica of a NebulaGraph storage partition.
"},{"location":"k8s-operator/5.FAQ/#how_to_view_the_logs_of_each_service_in_the_nebulagraph_cluster","title":"How to view the logs of each service in the NebulaGraph cluster?","text":"To obtain the logs of each cluster service, you need to access the container and view the log files that are stored inside.
Steps to view the logs of each service in the NebulaGraph cluster:
# To view the name of the pod where the container you want to access is located. \n# Replace <cluster-name> with the name of the cluster.\nkubectl get pods -l app.kubernetes.io/cluster=<cluster-name>\n\n# To access the container within the pod, such as the nebula-graphd-0 container.\nkubectl exec -it nebula-graphd-0 -- /bin/bash\n\n# To go to /usr/local/nebula/logs directory to view the logs.\ncd /usr/local/nebula/logs\n
"},{"location":"k8s-operator/5.FAQ/#how_to_resolve_the_host_not_foundnebula-metadstoragedgraphd-0nebulametadstoragedgraphd-headlessdefaultsvcclusterlocal_error","title":"How to resolve the host not found:nebula-<metad|storaged|graphd>-0.nebula.<metad|storaged|graphd>-headless.default.svc.cluster.local
error?","text":"This error is generally caused by a DNS resolution failure, and you need to check whether the cluster domain has been modified. If the cluster domain has been modified, you need to modify the kubernetesClusterDomain
field in the NebulaGraph Operator configuration file accordingly. The steps for modifying the Operator configuration file are as follows:
View the Operator configuration file.
[abby@master ~]$ helm show values nebula-operator/nebula-operator \nimage:\n nebulaOperator:\n image: vesoft/nebula-operator:v1.8.0\n imagePullPolicy: Always\n kubeRBACProxy:\n image: bitnami/kube-rbac-proxy:0.14.2\n imagePullPolicy: Always\n kubeScheduler:\n image: registry.k8s.io/kube-scheduler:v1.24.11\n imagePullPolicy: Always\n\nimagePullSecrets: []\nkubernetesClusterDomain: \"\" # The cluster domain name, and the default is cluster.local.\n
Modify the value of the kubernetesClusterDomain
field to the updated cluster domain name.
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=<nebula-operator-system> --version=1.8.0 --set kubernetesClusterDomain=<cluster-domain>\n
<nebula-operator-system> is the namespace where Operator is located, and <cluster-domain> is the updated domain name.
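For example, assuming the cluster domain was changed to the illustrative value nebula.local:
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0 --set kubernetesClusterDomain=nebula.local\n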
"},{"location":"k8s-operator/2.get-started/2.1.install-operator/#background","title":"Background","text":"NebulaGraph Operator automates the management of NebulaGraph clusters, and eliminates the need for you to install, scale, upgrade, and uninstall NebulaGraph clusters, which lightens the burden on managing different application versions.
"},{"location":"k8s-operator/2.get-started/2.1.install-operator/#prerequisites","title":"Prerequisites","text":"Before installing NebulaGraph Operator, you need to install the following software and ensure the correct version of the software :
Software Requirement Kubernetes >= 1.18 Helm >= 3.2.0 CoreDNS >= 1.6.0Note
Add the NebulaGraph Operator Helm repository.
helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts\n
Update information of available charts locally from repositories.
helm repo update\n
For more information about helm repo
, see Helm Repo.
Create a namespace for NebulaGraph Operator.
kubectl create namespace <namespace_name>\n
For example, run the following command to create a namespace named nebula-operator-system
.
kubectl create namespace nebula-operator-system\n
All the resources of NebulaGraph Operator are deployed in this namespace.
Install NebulaGraph Operator.
helm install nebula-operator nebula-operator/nebula-operator --namespace=<namespace_name> --version=${chart_version}\n
For example, the command to install NebulaGraph Operator of version 1.8.0 is as follows.
helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0\n
1.8.0
is the version of the nebula-operator chart. When not specifying --version
, the latest version of the nebula-operator chart is used by default.
Run helm search repo -l nebula-operator
to see chart versions.
You can customize the configuration items of the NebulaGraph Operator chart before running the installation command. For more information, see Customize installation defaults.
View the information about the default-created CRD.
kubectl get crd\n
Output:
NAME CREATED AT\nnebulaautoscalers.autoscaling.nebula-graph.io 2023-11-01T04:16:51Z\nnebulaclusters.apps.nebula-graph.io 2023-10-12T07:55:32Z\nnebularestores.apps.nebula-graph.io 2023-02-04T23:01:00Z\n
Create a NebulaGraph cluster
"},{"location":"k8s-operator/2.get-started/2.3.create-cluster/","title":"Create a NebulaGraph cluster","text":"This topic introduces how to create a NebulaGraph cluster with the following two methods:
Legacy version compatibility
The 1.x version NebulaGraph Operator is not compatible with NebulaGraph of version below v3.x.
Add the NebulaGraph Operator Helm repository.
helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts\n
Update information of available charts locally from chart repositories.
helm repo update\n
Set environment variables to your desired values.
export NEBULA_CLUSTER_NAME=nebula # The desired NebulaGraph cluster name.\nexport NEBULA_CLUSTER_NAMESPACE=nebula # The desired namespace where your NebulaGraph cluster locates.\nexport STORAGE_CLASS_NAME=fast-disks # The name of the StorageClass that has been created.\n
Create a namespace for your NebulaGraph cluster (If you have created one, skip this step).
kubectl create namespace \"${NEBULA_CLUSTER_NAMESPACE}\"\n
Apply the variables to the Helm chart to create a NebulaGraph cluster.
helm install \"${NEBULA_CLUSTER_NAME}\" nebula-operator/nebula-cluster \\\n --set nameOverride=\"${NEBULA_CLUSTER_NAME}\" \\\n --set nebula.storageClassName=\"${STORAGE_CLASS_NAME}\" \\\n # Specify the version of the NebulaGraph cluster. \n --set nebula.version=vmaster \\ \n # Specify the version of the nebula-cluster chart. If not specified, the latest version of the chart is installed by default.\n # Run 'helm search repo nebula-operator/nebula-cluster' to view the available versions of the chart. \n --version=1.8.0 \\\n --namespace=\"${NEBULA_CLUSTER_NAMESPACE}\" \\\n
Legacy version compatibility
The 1.x version NebulaGraph Operator is not compatible with NebulaGraph of version below v3.x.
The following example shows how to create a NebulaGraph cluster named nebula
.
Create a namespace, for example, nebula
. If not specified, the default
namespace is used.
kubectl create namespace nebula\n
Define the cluster configuration file nebulacluster.yaml
.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n topologySpreadConstraints:\n - topologyKey: \"kubernetes.io/hostname\"\n whenUnsatisfiable: \"ScheduleAnyway\"\n graphd:\n # Container image for the Graph service.\n image: vesoft/nebula-graphd\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n # Storage class name for storing Graph service logs.\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n imagePullPolicy: Always\n metad:\n # Container image for the Meta service.\n image: vesoft/nebula-metad\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n reference:\n name: statefulsets.apps\n version: v1\n schedulerName: default-scheduler\n storaged:\n # Container image for the Storage service.\n image: vesoft/nebula-storaged\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n
For more information about the other parameters, see Install NebulaGraph clusters.
Create a NebulaGraph cluster.
kubectl create -f nebulacluster.yaml\n
Output:
nebulacluster.apps.nebula-graph.io/nebula created\n
Check the status of the NebulaGraph cluster.
kubectl get nc nebula\n
Output:
NAME READY GRAPHD-DESIRED GRAPHD-READY METAD-DESIRED METAD-READY STORAGED-DESIRED STORAGED-READY AGE\nnebula True 1 1 1 1 1 1 86s\n
Connect to a cluster
"},{"location":"k8s-operator/2.get-started/2.4.connect-to-cluster/","title":"Connect to a NebulaGraph cluster","text":"After creating a NebulaGraph cluster with NebulaGraph Operator on Kubernetes, you can connect to NebulaGraph databases from within the cluster and outside the cluster.
"},{"location":"k8s-operator/2.get-started/2.4.connect-to-cluster/#prerequisites","title":"Prerequisites","text":"A NebulaGraph cluster is created on Kubernetes. For more information, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/2.get-started/2.4.connect-to-cluster/#connect_to_nebulagraph_databases_from_within_a_nebulagraph_cluster","title":"Connect to NebulaGraph databases from within a NebulaGraph cluster","text":"You can create a ClusterIP
type Service to provide an access point to the NebulaGraph database for other Pods within the cluster. By using the Service's IP and the Graph service's port number (9669), you can connect to the NebulaGraph database. For more information, see ClusterIP.
Create a file named graphd-clusterip-service.yaml
. The file contents are as follows:
apiVersion: v1\nkind: Service\nmetadata:\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: nebula-graphd-svc\n namespace: default\nspec:\n ports:\n - name: thrift\n port: 9669\n protocol: TCP\n targetPort: 9669\n - name: http\n port: 19669\n protocol: TCP\n targetPort: 19669\n selector:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n type: ClusterIP # Set the type to ClusterIP.\n
9669
by default. 19669
is the HTTP port of the Graph service in a NebulaGraph cluster.targetPort
is the port mapped to the database Pods, which can be customized.Create a ClusterIP Service.
kubectl create -f graphd-clusterip-service.yaml \n
Check the IP of the Service:
$ kubectl get service -l app.kubernetes.io/cluster=<nebula> # <nebula> is the name of your NebulaGraph cluster.\nNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-svc ClusterIP 10.98.213.34 <none> 9669/TCP,19669/TCP,19670/TCP 23h\n...\n
Run the following command to connect to the NebulaGraph database using the IP of the <cluster-name>-graphd-svc
Service above:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_ip> -port <service_port> -u <username> -p <password>\n
For example:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 10.98.213.34 -port 9669 -u root -p vesoft\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
: The custom Pod name.-addr
: The IP of the ClusterIP
Service, used to connect to Graphd services.-port
: The port to connect to Graphd services, the default port of which is 9669
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.A successful connection to the database is indicated if the following is returned:
If you don't see a command prompt, try pressing enter.\n\n(root@nebula) [(none)]>\n
You can also connect to NebulaGraph databases with Fully Qualified Domain Name (FQDN). The domain format is <cluster-name>-graphd.<cluster-namespace>.svc.<CLUSTER_DOMAIN>
. The default value of CLUSTER_DOMAIN
is cluster.local
.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_name>-graphd-svc.default.svc.cluster.local -port <service_port> -u <username> -p <password>\n
service_port
is the port to connect to Graphd services, the default port of which is 9669
.
Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr nebula-graphd-svc.default.svc.cluster.local -port 9669 -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
You can also create a ClusterIP
type Service to provide an access point to the NebulaGraph database for other Pods within the cluster. By using the Service's IP and the Graph service's port number (9669), you can connect to the NebulaGraph database. For more information, see ClusterIP.
Create a file named graphd-clusterip-service.yaml
. The file contents are as follows:
apiVersion: v1\nkind: Service\nmetadata:\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: nebula-graphd-svc\n namespace: default\nspec:\n externalTrafficPolicy: Local\n ports:\n - name: thrift\n port: 9669\n protocol: TCP\n targetPort: 9669\n - name: http\n port: 19669\n protocol: TCP\n targetPort: 19669\n selector:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n type: ClusterIP # Set the type to ClusterIP.\n
9669
by default. 19669
is the HTTP port of the Graph service in a NebulaGraph cluster.targetPort
is the port mapped to the database Pods, which can be customized.Create a ClusterIP Service.
kubectl create -f graphd-clusterip-service.yaml \n
Check the IP of the Service:
$ kubectl get service -l app.kubernetes.io/cluster=<nebula> # <nebula> is the name of your NebulaGraph cluster.\nNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-svc ClusterIP 10.98.213.34 <none> 9669/TCP,19669/TCP,19670/TCP 23h\n...\n
Run the following command to connect to the NebulaGraph database using the IP of the <cluster-name>-graphd-svc
Service above:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_ip> -port <service_port> -u <username> -p <password>\n
For example:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 10.98.213.34 -port 9669 -u root -p vesoft\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
: The custom Pod name.-addr
: The IP of the ClusterIP
Service, used to connect to Graphd services.-port
: The port to connect to Graphd services, the default port of which is 9669
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.A successful connection to the database is indicated if the following is returned:
If you don't see a command prompt, try pressing enter.\n\n(root@nebula) [(none)]>\n
You can also connect to NebulaGraph databases with Fully Qualified Domain Name (FQDN). The domain format is <cluster-name>-graphd.<cluster-namespace>.svc.<CLUSTER_DOMAIN>
. The default value of CLUSTER_DOMAIN
is cluster.local
.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_name>-graphd-svc.default.svc.cluster.local -port <service_port> -u <username> -p <password>\n
service_port
is the port to connect to Graphd services, the default port of which is 9669
.
Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr nebula-graphd-svc.default.svc.cluster.local -port 9669 -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
NodePort
","text":"You can create a NodePort
type Service to access internal cluster services from outside the cluster using any node IP and the exposed node port. You can also utilize load balancing services provided by cloud vendors (such as Azure, AWS, etc.) by setting the Service type to LoadBalancer
. This allows external access to internal cluster services through the public IP and port of the load balancer provided by the cloud vendor.
The Service of type NodePort
forwards the front-end requests via the label selector spec.selector
to Graphd pods with labels app.kubernetes.io/cluster: <cluster-name>
and app.kubernetes.io/component: graphd
.
After creating a NebulaGraph cluster based on the example template, where spec.graphd.service.type=NodePort
, the NebulaGraph Operator will automatically create a NodePort type Service named <cluster-name>-graphd-svc
in the same namespace. You can directly connect to the NebulaGraph database through any node IP and the exposed node port (see step 4 below). You can also create a custom Service according to your needs.
Steps:
Create a YAML file named graphd-nodeport-service.yaml
. The file contents are as follows:
apiVersion: v1\nkind: Service\nmetadata:\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: nebula-graphd-svc-nodeport\n namespace: default\nspec:\n externalTrafficPolicy: Local\n ports:\n - name: thrift\n port: 9669\n protocol: TCP\n targetPort: 9669\n - name: http\n port: 19669\n protocol: TCP\n targetPort: 19669\n selector:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n type: NodePort # Set the type to NodePort.\n
9669
by default. 19669
is the HTTP port of the Graph service in a NebulaGraph cluster.targetPort
is the port mapped to the database Pods, which can be customized.Run the following command to create a NodePort Service.
kubectl create -f graphd-nodeport-service.yaml\n
Check the port mapped on all of your cluster nodes.
kubectl get services -l app.kubernetes.io/cluster=<nebula> # <nebula> is the name of your NebulaGraph cluster.\n
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-svc-nodeport NodePort 10.107.153.129 <none> 9669:32236/TCP,19669:31674/TCP,19670:31057/TCP 24h\n...\n
As you see, the mapped port of NebulaGraph databases on all cluster nodes is 32236
.
Connect to NebulaGraph databases with your node IP and the node port above.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <node_ip> -port <node_port> -u <username> -p <password>\n
For example:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 192.168.8.24 -port 32236 -u root -p vesoft\nIf you don't see a command prompt, try pressing enter.\n\n(root@nebula) [(none)]>\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
: The custom Pod name. The above example uses nebula-console
.-addr
: The IP of any node in a NebulaGraph cluster. The above example uses 192.168.8.24
.-port
: The mapped port of NebulaGraph databases on all cluster nodes. The above example uses 32236
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr <node_ip> -port <node_port> -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
When dealing with multiple pods in a cluster, managing services for each pod separately is not a good practice. Ingress is a Kubernetes resource that provides a unified entry point for accessing multiple services. Ingress can be used to expose multiple services under a single IP address.
Nginx Ingress is an implementation of Kubernetes Ingress. Nginx Ingress watches the Ingress resource of a Kubernetes cluster and generates the Ingress rules into Nginx configurations that enable Nginx to forward 7 layers of traffic.
You can use Nginx Ingress to connect to a NebulaGraph cluster from outside the cluster using a combination of the host network and DaemonSet pattern.
Due to the use of HostNetwork
, Nginx Ingress pods may be scheduled on the same node (port conflicts will occur when multiple pods try to listen on the same port on the same node). To avoid this situation, Nginx Ingress is deployed on these nodes in DaemonSet mode (ensuring that a pod replica runs on each node in the cluster). You first need to select some nodes and label them for the specific deployment of Nginx Ingress.
Ingress does not support TCP or UDP services. For this reason, the nginx-ingress-controller pod uses the flags --tcp-services-configmap
and --udp-services-configmap
to point to an existing ConfigMap, where the key is the external port to be used and the value indicates the service to be exposed. The format of the value is <namespace/service_name>:<service_port>
.
For example, the configuration of the ConfigMap named tcp-services
is as follows:
apiVersion: v1\nkind: ConfigMap\nmetadata:\n name: tcp-services\n namespace: nginx-ingress\ndata:\n # Map external port 9769 to port 9669 of the nebula-graphd-svc Service in the default namespace.\n 9769: \"default/nebula-graphd-svc:9669\"\n
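The nginx-ingress-controller then references this ConfigMap by namespace and name in its startup arguments. A minimal sketch of the relevant args entry, assuming the namespace nginx-ingress used above (the full DaemonSet spec comes from the example YAML file):
args:\n - /nginx-ingress-controller\n - --tcp-services-configmap=nginx-ingress/tcp-services\n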
Steps are as follows.
Create a file named nginx-ingress-daemonset-hostnetwork.yaml
.
Click on nginx-ingress-daemonset-hostnetwork.yaml to view the complete content of the example YAML file.
Note
The resource objects in the YAML file above use the namespace nginx-ingress
. You can run kubectl create namespace nginx-ingress
to create this namespace, or you can customize the namespace.
Label a node where the DaemonSet named nginx-ingress-controller
in the above YAML file (The node used in this example is named worker2
with an IP of 192.168.8.160
) runs.
kubectl label node worker2 nginx-ingress=true\n
Run the following command to enable Nginx Ingress in the cluster you created.
kubectl create -f nginx-ingress-daemonset-hostnetwork.yaml\n
Output:
configmap/nginx-ingress-controller created\nconfigmap/tcp-services created\nserviceaccount/nginx-ingress created\nserviceaccount/nginx-ingress-backend created\nclusterrole.rbac.authorization.k8s.io/nginx-ingress created\nclusterrolebinding.rbac.authorization.k8s.io/nginx-ingress created\nrole.rbac.authorization.k8s.io/nginx-ingress created\nrolebinding.rbac.authorization.k8s.io/nginx-ingress created\nservice/nginx-ingress-controller-metrics created\nservice/nginx-ingress-default-backend created\nservice/nginx-ingress-proxy-tcp created\ndaemonset.apps/nginx-ingress-controller created\n
Since the network type configured for Nginx Ingress is hostNetwork, after successfully deploying Nginx Ingress, you can access NebulaGraph with the IP (192.168.8.160) of the node where Nginx Ingress is deployed and the external port (9769) you defined.
Use the IP address and the port configured in the preceding steps. You can connect to NebulaGraph with NebulaGraph Console.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <host_ip> -port <external_port> -u <username> -p <password>\n
Output:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 192.168.8.160 -port 9769 -u root -p vesoft\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
The custom Pod name. The above example uses nebula-console
.-addr
: The IP of the node where Nginx Ingress is deployed. The above example uses 192.168.8.160
.-port
: The port used for external network access. The above example uses 9769
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.A successful connection to the database is indicated if the following is returned:
If you don't see a command prompt, try pressing enter.\n(root@nebula) [(none)]>\n
Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr <ingress_host_ip> -port <external_port> -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
This topic introduces how to customize the default configurations when installing NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.1.customize-installation/#customizable_parameters","title":"Customizable parameters","text":"When executing the helm install [NAME] [CHART] [flags]
command to install a chart, you can specify the chart configuration. For more information, see Customizing the Chart Before Installing.
You can view the configurable options in the nebula-operator chart configuration file. Alternatively, you can view the configurable options through the command helm show values nebula-operator/nebula-operator
, as shown below.
[root@master ~]$ helm show values nebula-operator/nebula-operator \nimage:\n nebulaOperator:\n image: vesoft/nebula-operator:v1.8.0\n imagePullPolicy: Always\n\nimagePullSecrets: [ ]\nkubernetesClusterDomain: \"\"\n\ncontrollerManager:\n create: true\n replicas: 2\n env: [ ]\n resources:\n limits:\n cpu: 200m\n memory: 200Mi\n requests:\n cpu: 100m\n memory: 100Mi\n verbosity: 0\n ## Additional InitContainers to initialize the pod\n # Example:\n # extraInitContainers:\n # - name: init-auth-sidecar\n # command:\n # - /bin/sh\n # - -c\n # args:\n # - cp -R /certs/* /credentials/\n # imagePullPolicy: Always\n # image: reg.vesoft-inc.com/nebula-certs:latest\n # volumeMounts:\n # - name: credentials\n # mountPath: /credentials\n extraInitContainers: []\n\n # sidecarContainers - add more containers to controller-manager\n # Key/Value where Key is the sidecar `- name: <Key>`\n # Example:\n # sidecarContainers:\n # webserver:\n # image: nginx\n # OR for adding netshoot to controller manager\n # sidecarContainers:\n # netshoot:\n # args:\n # - -c\n # - while true; do ping localhost; sleep 60;done\n # command:\n # - /bin/bash\n # image: nicolaka/netshoot\n # imagePullPolicy: Always\n # name: netshoot\n # resources: {}\n sidecarContainers: {}\n\n ## Additional controller-manager Volumes\n extraVolumes: []\n\n ## Additional controller-manager Volume mounts\n extraVolumeMounts: []\n\n securityContext: {}\n # runAsNonRoot: true\n\nadmissionWebhook:\n create: false\n # The TCP port the Webhook server binds to. (default 9443)\n webhookBindPort: 9443\n\nscheduler:\n create: true\n schedulerName: nebula-scheduler\n replicas: 2\n env: [ ]\n resources:\n limits:\n cpu: 200m\n memory: 200Mi\n requests:\n cpu: 100m\n memory: 100Mi\n verbosity: 0\n plugins:\n enabled: [\"NodeZone\"]\n disabled: [] # Only in-tree plugins need to be defined here\n...\n
Part of the above parameters are described as follows:
Parameter Default value Descriptionimage.nebulaOperator.image
vesoft/nebula-operator:v1.8.0
The image of NebulaGraph Operator, version of which is 1.8.0. image.nebulaOperator.imagePullPolicy
IfNotPresent
The image pull policy in Kubernetes. imagePullSecrets
- The image pull secret in Kubernetes. For example imagePullSecrets[0].name=\"vesoft\"
. kubernetesClusterDomain
cluster.local
The cluster domain. controllerManager.create
true
Whether to enable the controller-manager component. controllerManager.replicas
2
The number of controller-manager replicas. controllerManager.env
[]
The environment variables for the controller-manager component. controllerManager.extraInitContainers
[]
Runs an init container. controllerManager.sidecarContainers
{}
Runs a sidecar container. controllerManager.extraVolumes
[]
Sets a storage volume. controllerManager.extraVolumeMounts
[]
Sets the storage volume mount path. controllerManager.securityContext
{}
Configures the access and control settings for NebulaGraph Operator. admissionWebhook.create
false
Whether to enable Admission Webhook. This option is disabled. To enable it, set the value to true
and you will need to install cert-manager. For details, see Enable admission control. admissionWebhook.webhookBindPort
9443
The TCP port the Webhook server binds to. It is 9443 by default. shceduler.create
true
Whether to enable Scheduler. shceduler.schedulerName
nebula-scheduler
The name of the scheduler customized by NebulaGraph Operator. shceduler.replicas
2
The number of nebula-scheduler replicas."},{"location":"k8s-operator/3.operator-management/3.1.customize-installation/#example","title":"Example","text":"The following example shows how to enable AdmissionWebhook when you install NebulaGraph Operator (AdmissionWebhook is disabled by default):
helm install nebula-operator nebula-operator/nebula-operator --namespace=<nebula-operator-system> --set admissionWebhook.create=true\n
Check whether the specified configuration of NebulaGraph Operator is installed successfully:
helm get values nebula-operator -n <nebula-operator-system>\n
Example output:
USER-SUPPLIED VALUES:\nadmissionWebhook:\n create: true\n
For more information about helm install
, see Helm Install.
This topic introduces how to update the configuration of NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.2.update-operator/#steps","title":"Steps","text":"Update the information of available charts locally from chart repositories.
helm repo update\n
View the default values of NebulaGraph Operator.
helm show values nebula-operator/nebula-operator\n
Update NebulaGraph Operator by passing configuration parameters via --set
.
--set
: Overrides values using the command line. For more configurable items, see Customize installation defaults.
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0 --set admissionWebhook.create=true\n
For more information, see Helm upgrade.
Check whether the configuration of NebulaGraph Operator is updated successfully.
helm get values nebula-operator -n nebula-operator-system\n
Example output:
USER-SUPPLIED VALUES:\nadmissionWebhook:\n create: true\n
Legacy version compatibility
View the current version of NebulaGraph Operator.
helm list --all-namespaces\n
Example output:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION\nnebula-operator nebula-operator-system 3 2023-11-06 12:06:24.742397418 +0800 CST deployed nebula-operator-1.7.0 1.7.0\n
Update the information of available charts locally from chart repositories.
helm repo update\n
View the latest version of NebulaGraph Operator.
helm search repo nebula-operator/nebula-operator\n
Example output:
NAME CHART VERSION APP VERSION DESCRIPTION\nnebula-operator/nebula-operator 1.8.0 1.8.0 Nebula Operator Helm chart for Kubernetes\n
Upgrade NebulaGraph Operator to version 1.8.0.
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=<namespace_name> --version=1.8.0\n
For example:
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0\n
Output:
Release \"nebula-operator\" has been upgraded. Happy Helming!\nNAME: nebula-operator\nLAST DEPLOYED: Tue Apr 16 02:21:08 2022\nNAMESPACE: nebula-operator-system\nSTATUS: deployed\nREVISION: 3\nTEST SUITE: None\nNOTES:\nNebulaGraph Operator installed!\n
Pull the latest CRD configuration file.
Note
You need to upgrade the corresponding CRD configurations after NebulaGraph Operator is upgraded. Otherwise, the creation of NebulaGraph clusters will fail. For information about the CRD configurations, see apps.nebula-graph.io_nebulaclusters.yaml.
Pull the NebulaGraph Operator chart package.
helm pull nebula-operator/nebula-operator --version=1.8.0\n
--version
: The NebulaGraph Operator version you want to upgrade to. If not specified, the latest version will be pulled.Run tar -zxvf
to unpack the charts.
For example: To unpack v1.8.0 chart to the /tmp
path, run the following command:
tar -zxvf nebula-operator-1.8.0.tgz -C /tmp\n
-C /tmp
: If not specified, the chart files will be unpacked to the current directory.Apply the latest CRD configuration file in the nebula-operator
directory.
kubectl apply -f crds/nebulaclusters.yaml\n
Output:
customresourcedefinition.apiextensions.k8s.io/nebulaclusters.apps.nebula-graph.io configured\n
This topic introduces how to uninstall NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.4.unistall-operator/#steps","title":"Steps","text":"Uninstall the NebulaGraph Operator chart.
helm uninstall nebula-operator --namespace=<nebula-operator-system>\n
View the information about the default-created CRD.
kubectl get crd\n
Output:
NAME CREATED AT\nnebulaautoscalers.autoscaling.nebula-graph.io 2023-11-01T04:16:51Z\nnebulaclusters.apps.nebula-graph.io 2023-10-12T07:55:32Z\nnebularestores.apps.nebula-graph.io 2023-02-04T23:01:00Z\n
Delete CRD.
kubectl delete crd nebulaclusters.apps.nebula-graph.io nebularestores.apps.nebula-graph.io nebulaautoscalers.autoscaling.nebula-graph.io\n
NebulaGraph Operator supports the management of multiple NebulaGraph clusters. By default, NebulaGraph Operator manages all NebulaGraph clusters. However, you can specify the clusters managed by NebulaGraph Operator. This topic describes how to specify the clusters managed by NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#application_scenarios","title":"Application scenarios","text":"NebulaGraph Operator supports specifying the clusters managed by controller-manager through startup parameters. The supported parameters are as follows:
watchNamespaces
: Specifies the namespace where the NebulaGraph cluster is located. To specify multiple namespaces, separate them with commas (,
). For example, watchNamespaces=default,nebula
. If this parameter is not specified, NebulaGraph Operator manages all NebulaGraph clusters in all namespaces.nebulaObjectSelector
: Allows you to set specific labels and values to select the NebulaGraph clusters to be managed. It supports three label operation symbols: =
, ==
, and !=
. Both =
and ==
mean that the label's value is equal to the specified value, while !=
means the tag's value is not equal to the specified value. Multiple labels are separated by commas (,
), and the comma needs to be escaped with \\\\
. For example, nebulaObjectSelector=key1=value1\\\\,key2=value2
, which selects only the NebulaGraph clusters with labels key1=value1
and key2=value2
. If this parameter is not specified, NebulaGraph Operator manages all NebulaGraph clusters.Run the following command to make NebulaGraph Operator manage only the NebulaGraph clusters in the default
and nebula
namespaces. Ensure that the current Helm Chart version supports this parameter. For more information, see Update the configuration.
helm upgrade nebula-operator nebula-operator/nebula-operator --set watchNamespaces=default,nebula\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#specify_the_managed_clusters_by_label","title":"Specify the managed clusters by label","text":"Run the following command to make NebulaGraph Operator manage only the NebulaGraph clusters with the labels key1=value1
and key2=value2
. Ensure that the current Helm Chart version supports this parameter. For more information, see Update the configuration.
helm upgrade nebula-operator nebula-operator/nebula-operator --set nebulaObjectSelector=key1=value1\\\\,key2=value2\n
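To confirm which clusters match the selector, you can list all clusters together with their labels; a quick sketch:
kubectl get nc --all-namespaces --show-labels\n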
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#faq","title":"FAQ","text":""},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_set_labels_for_nebulagraph_clusters","title":"How to set labels for NebulaGraph clusters?","text":"Run the following command to set a label for the NebulaGraph cluster:
kubectl label nc <cluster_name> -n <namespace> <key>=<value>\n
For example, set the label env=test
for the NebulaGraph cluster named nebula
in the nebulaspace
namespace:
kubectl label nc nebula -n nebulaspace env=test\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_view_the_labels_of_nebulagraph_clusters","title":"How to view the labels of NebulaGraph clusters?","text":"Run the following command to view the labels of NebulaGraph clusters:
kubectl get nc <cluster_name> -n <namespace> --show-labels\n
For example, view the labels of the NebulaGraph cluster named nebula
in the nebulaspace
namespace:
kubectl get nc nebula -n nebulaspace --show-labels\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_delete_the_labels_of_nebulagraph_clusters","title":"How to delete the labels of NebulaGraph clusters?","text":"Run the following command to delete the label of NebulaGraph clusters:
kubectl label nc <cluster_name> -n <namespace> <key>-\n
For example, delete the label env=test
of the NebulaGraph cluster named nebula
in the nebulaspace
namespace:
kubectl label nc nebula -n nebulaspace env-\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_view_the_namespace_where_the_nebulagraph_cluster_is_located","title":"How to view the namespace where the NebulaGraph cluster is located?","text":"Run the following command to list all namespaces where the NebulaGraph clusters are located:
kubectl get nc --all-namespaces\n
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/","title":"Customize the configuration of the NebulaGraph cluster","text":"The Meta, Storage, and Graph services each have their default configurations within the NebulaGraph cluster. NebulaGraph Operator allows for the customization of these cluster service configurations. This topic describes how to update the settings of the NebulaGraph cluster.
Note
Configuring the parameters of the NebulaGraph cluster via Helm isn't currently supported.
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/#prerequisites","title":"Prerequisites","text":"A cluster is created using NebulaGraph Operator. For details, see Create a NebulaGraph Cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/#configuration_method","title":"Configuration method","text":"You can update the configurations of cluster services by customizing parameters through spec.<metad|graphd|storaged>.config
. NebulaGraph Operator loads the configurations from config
into the corresponding service's ConfigMap, which is then mounted into the service's configuration file directory (/usr/local/nebula/etc/
) at the time of the service launch.
The structure of config
is as follows:
Config map[string]string `json:\"config,omitempty\"`\n
For instance, when updating the Graph service's enable_authorize
parameter settings, the spec.graphd.config
parameter can be specified at the time of cluster creation, or during cluster runtime.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n graphd:\n ...\n config: // Custom-defined parameters for the Graph service.\n \"enable_authorize\": \"true\" // Enable authorization. Default value is false.\n...\n
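To confirm that the Operator propagated the flag, you can inspect the generated ConfigMap. A sketch, assuming the ConfigMap follows the <cluster_name>-graphd naming pattern (this name is an assumption; verify it with kubectl get configmap first):
kubectl get configmap nebula-graphd -n default -o yaml | grep enable_authorize\n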
If you need to configure config
for the Meta and Storage services, add corresponding configuration items to spec.metad.config
and spec.storaged.config
.
For more detailed information on the parameters that can be set under the config
field, see the following:
Configuration parameters for cluster services fall into two categories: those that require a service restart to take effect, and those that can be updated dynamically during service runtime. For the latter type, the updates are not persisted; after a service restart, the configurations revert to the values in the configuration file.
To check whether a configuration parameter supports dynamic updates during service runtime, see the Whether supports runtime dynamic modifications column on each of the service configuration parameter detail pages linked above, or see Dynamic runtime flags.
During the update of cluster service configurations, keep the following points in mind:
If the parameters updated in config
all allow for dynamic runtime updates, a service Pod restart will not be triggered and the configuration parameter updates will not be saved.If the parameters updated in config
include one or more that don't allow for dynamic runtime updates, a service Pod restart will be triggered, but only updates to those parameters that don't allow for dynamic updates will be saved.Note
If you wish to modify the parameter settings during cluster runtime without triggering a Pod restart, make sure that all the parameters support dynamic updates during runtime.
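For example, the following sketch updates the Graph service's log verbosity in place, assuming the v flag supports dynamic runtime updates (verify this on the Dynamic runtime flags page before relying on it):
kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"graphd\": {\"config\": {\"v\": \"1\"}}}}'\n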
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/#customize_port_configuration","title":"Customize port configuration","text":"The following example demonstrates how to customize the port configurations for the Meta, Storage, and Graph services.
You can add port
and ws_http_port
parameters to the config
field in order to set custom ports. For detailed information regarding these two parameters, see the networking configuration sections at Meta Service Configuration Parameters, Storage Service Configuration Parameters, Graph Service Configuration Parameters.
Note
After customizing the port
and ws_http_port
parameter settings, a Pod restart is triggered and then the updated settings take effect after the restart.After the cluster is started, it is not recommended to modify the port
parameter.Modify the cluster configuration file.
Open the cluster configuration file.
kubectl edit nc nebula\n
Modify the configuration file as follows.
Add the config
field to the graphd
, metad
, and storaged
sections to customize the port configurations for the Graph, Meta, and Storage services, respectively.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n graphd:\n config: // Custom port configuration for the Graph service.\n port: \"3669\"\n ws_http_port: \"8080\"\n resources:\n requests:\n cpu: \"200m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n replicas: 1\n image: vesoft/nebula-graphd\n version: master\n metad: \n config: // Custom port configuration for the Meta service. Values must be strings because config is a map[string]string.\n ws_http_port: \"8081\"\n resources:\n requests:\n cpu: \"300m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n replicas: 1\n image: vesoft/nebula-metad\n version: master\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-path\n storaged: \n config: // Custom port configuration for the Storage service.\n ws_http_port: \"8082\"\n resources:\n requests:\n cpu: \"300m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n replicas: 1\n image: vesoft/nebula-storaged\n version: master\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-path\n enableAutoBalance: true\n reference:\n name: statefulsets.apps\n version: v1\n schedulerName: default-scheduler\n imagePullPolicy: IfNotPresent\n imagePullSecrets:\n - name: nebula-image\n enablePVReclaim: true\n topologySpreadConstraints:\n - topologyKey: kubernetes.io/hostname\n whenUnsatisfiable: \"ScheduleAnyway\"\n
Save the changes.
The changes take effect automatically after the file is saved.
Press Esc
to enter command mode, then enter :wq
to save and exit.Validate that the configurations have taken effect.
kubectl get svc\n
Example output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-headless ClusterIP None <none> 3669/TCP,8080/TCP 10m\nnebula-graphd-svc ClusterIP 10.102.13.115 <none> 3669/TCP,8080/TCP 10m\nnebula-metad-headless ClusterIP None <none> 9559/TCP,8081/TCP 11m\nnebula-storaged-headless ClusterIP None <none> 9779/TCP,8082/TCP,9778/TCP 11m\n
As shown above, the Graph service's RPC daemon port is changed to 3669
(default 9669
), the HTTP port to 8080
(default 19669
); the Meta service's HTTP port is changed to 8081
(default 19559
); the Storage service's HTTP port is changed to 8082
(default 19779
).
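As an optional spot check, you can port-forward the Graph service and query its HTTP status endpoint; a sketch assuming the /status path is served on the customized ws_http_port:
# Forward the customized HTTP port locally (run in a separate terminal).\nkubectl port-forward svc/nebula-graphd-svc 8080:8080\n# Then query the status endpoint; it should report the service status.\ncurl http://localhost:8080/status\n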
Running logs of NebulaGraph cluster services (graphd, metad, storaged) are generated and stored in the /usr/local/nebula/logs
directory of each service container by default.
To view the running logs of a NebulaGraph cluster, you can use the kubectl logs
command.
For example, to view the running logs of the Storage service:
// View the name of the Storage service Pod, nebula-storaged-0.\n$ kubectl get pods -l app.kubernetes.io/component=storaged\nNAME READY STATUS RESTARTS AGE\nnebula-storaged-0 1/1 Running 0 45h\n...\n\n// Enter the container storaged of the Storage service.\n$ kubectl exec -it nebula-storaged-0 -c storaged -- /bin/bash\n\n// View the running logs of the Storage service.\n$ cd /usr/local/nebula/logs\n
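Alternatively, kubectl logs prints a container's standard output and standard error; note that it shows little for NebulaGraph services unless logging is redirected to standard error (see Collect logs below). A sketch:
kubectl logs nebula-storaged-0 -c storaged --tail=100\n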
"},{"location":"k8s-operator/4.cluster-administration/4.5.logging/#clean_logs","title":"Clean logs","text":"Running logs generated by cluster services during runtime will occupy disk space. To avoid occupying too much disk space, the NebulaGraph Operator uses a sidecar container to periodically clean and archive logs.
To facilitate log collection and management, each NebulaGraph service deploys a sidecar container responsible for collecting logs generated by the service container and sending them to the specified log disk. The sidecar container automatically cleans and archives logs using the logrotate tool.
In the YAML configuration file of the cluster instance, set spec.logRotate
to enable log rotation and set timestamp_in_logfile_name
to false
to disable the timestamp in the log file name; both settings are required to implement log rotation for the target service. The timestamp_in_logfile_name
parameter is configured under the spec.<graphd|metad|storaged>.config
field. By default, the log rotation feature is turned off. Here is an example of enabling log rotation for all services:
...\nspec:\n graphd:\n config:\n # Whether to include a timestamp in the log file name. \n # You must set this parameter to false to enable log rotation. \n # It is set to true by default.\n \"timestamp_in_logfile_name\": \"false\"\n metad:\n config:\n \"timestamp_in_logfile_name\": \"false\"\n storaged:\n config:\n \"timestamp_in_logfile_name\": \"false\"\n logRotate: # Log rotation configuration\n # The number of times a log file is rotated before being deleted.\n # The default value is 5, and 0 means the log file will not be rotated before being deleted.\n rotate: 5\n # The log file is rotated only if it grows larger than the specified size. The default value is 200M.\n size: \"200M\"\n
"},{"location":"k8s-operator/4.cluster-administration/4.5.logging/#collect_logs","title":"Collect logs","text":"If you don't want to mount additional log disks to back up log files, or if you want to collect logs and send them to a log center using services like fluent-bit, you can configure logs to be output to standard error. The Operator uses the glog tool to log to standard error output.
Note
Currently, NebulaGraph Operator only collects standard error logs.
In the YAML configuration file of the cluster instance, you can configure logging to standard error output in the config
and env
fields of each service.
...\nspec:\n graphd:\n config:\n # Whether to redirect standard error to a separate output file. The default value is false, which means it is not redirected.\n redirect_stdout: \"false\"\n # The severity level of log content: INFO, WARNING, ERROR, and FATAL. The corresponding values are 0, 1, 2, and 3.\n stderrthreshold: \"0\"\n env: \n - name: GLOG_logtostderr # Write log to standard error output instead of a separate file.\n value: \"1\" # 1 represents writing to standard error output, and 0 represents writing to a file.\n image: vesoft/nebula-graphd\n replicas: 1\n resources:\n requests:\n cpu: 500m\n memory: 500Mi\n service:\n externalTrafficPolicy: Local\n type: NodePort\n version: vmaster\n metad:\n config:\n redirect_stdout: \"false\"\n stderrthreshold: \"0\"\n dataVolumeClaim:\n resources:\n requests:\n storage: 1Gi\n storageClassName: ebs-sc\n env:\n - name: GLOG_logtostderr\n value: \"1\"\n image: vesoft/nebula-metad\n ...\n
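With GLOG_logtostderr set to 1, service logs go to the container's standard error and can be streamed directly; a sketch:
kubectl logs nebula-graphd-0 -c graphd -f\n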
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.1.cluster-install/","title":"Install a NebulaGraph cluster using NebulaGraph Operator","text":"Using NebulaGraph Operator to install NebulaGraph clusters enables automated cluster management with automatic error recovery. This topic covers two methods, kubectl apply
and helm
, for installing clusters using NebulaGraph Operator.
Historical version compatibility
NebulaGraph Operator versions 1.x are not compatible with NebulaGraph versions below 3.x.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.1.cluster-install/#prerequisites","title":"Prerequisites","text":"kubectl apply
","text":"Create a namespace for storing NebulaGraph cluster-related resources. For example, create the nebula
namespace.
kubectl create namespace nebula\n
Create a YAML configuration file nebulacluster.yaml
for the cluster. For example, create a cluster named nebula
.
nebula
cluster apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n # Control the Pod scheduling strategy.\n topologySpreadConstraints:\n - topologyKey: \"kubernetes.io/hostname\"\n whenUnsatisfiable: \"ScheduleAnyway\"\n # Enable PV recycling.\n enablePVReclaim: false\n # Enable monitoring.\n exporter:\n image: vesoft/nebula-stats-exporter\n version: v3.3.0\n replicas: 1\n maxRequests: 20\n # Custom Agent image for cluster backup and restore, and log cleanup.\n agent:\n image: vesoft/nebula-agent\n version: latest\n resources:\n requests:\n cpu: \"100m\"\n memory: \"128Mi\"\n limits:\n cpu: \"200m\"\n memory: \"256Mi\" \n # Configure the image pull policy.\n imagePullPolicy: Always\n # Select the nodes for Pod scheduling.\n nodeSelector:\n nebula: cloud\n # Dependent controller name.\n reference:\n name: statefulsets.apps\n version: v1\n # Scheduler name.\n schedulerName: default-scheduler \n # Start NebulaGraph Console service for connecting to the Graph service.\n console:\n image: vesoft/nebula-console\n version: nightly\n username: \"demo\"\n password: \"test\" \n # Graph service configuration. \n graphd:\n # Used to check if the Graph service is running normally.\n # readinessProbe:\n # failureThreshold: 3\n # httpGet:\n # path: /status\n # port: 19669\n # scheme: HTTP\n # initialDelaySeconds: 40\n # periodSeconds: 10\n # successThreshold: 1\n # timeoutSeconds: 10\n # Container image for the Graph service.\n image: vesoft/nebula-graphd\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n # Storage class name for storing Graph service logs.\n storageClassName: local-sc\n # Number of replicas for the Graph service Pod.\n replicas: 1\n # Resource configuration for the Graph service.\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n # Version of the Graph service.\n version: vmaster\n # Custom flags configuration for the Graph service.\n config: {}\n # Meta service configuration.\n metad:\n # readinessProbe:\n # failureThreshold: 3\n # httpGet:\n # path: /status\n # port: 19559\n # scheme: HTTP\n # initialDelaySeconds: 5\n # periodSeconds: 5\n # successThreshold: 1\n # timeoutSeconds: 5\n # Container image for the Meta service.\n image: vesoft/nebula-metad\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n # Custom flags configuration for the Meta service.\n config: {} \n # Storage service configuration.\n storaged:\n # readinessProbe:\n # failureThreshold: 3\n # httpGet:\n # path: /status\n # port: 19779\n # scheme: HTTP\n # initialDelaySeconds: 40\n # periodSeconds: 10\n # successThreshold: 1\n # timeoutSeconds: 5\n # Container image for the Storage service.\n image: vesoft/nebula-storaged\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n # Custom flags configuration for the Storage service.\n config: {} \n
Expand to view all configurable parameters and descriptions Parameter Default Value Description metadata.name
- The name of the created NebulaGraph cluster. spec.console
- Launches a Console container for connecting to the Graph service. For configuration details, see nebula-console. spec.topologySpreadConstraints
- Controls the scheduling strategy for Pods. For more details, see Topology Spread Constraints. When the value of topologyKey
is kubernetes.io/zone
, the value of whenUnsatisfiable
must be set to DoNotSchedule
, and the value of spec.schedulerName
should be nebula-scheduler
. spec.graphd.replicas
1
The number of replicas for the Graphd service. spec.graphd.image
vesoft/nebula-graphd
The container image for the Graphd service. spec.graphd.version
master
The version of the Graphd service. spec.graphd.service
Configuration for accessing the Graphd service via a Service. spec.graphd.logVolumeClaim.storageClassName
- The storage class name for the log volume claim of the Graphd service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.metad.replicas
1
The number of replicas for the Metad service. spec.metad.image
vesoft/nebula-metad
The container image for the Metad service. spec.metad.version
master
The version of the Metad service. spec.metad.dataVolumeClaim.storageClassName
- Storage configuration for the data disk of the Metad service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.metad.logVolumeClaim.storageClassName
- Storage configuration for the log disk of the Metad service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.storaged.replicas
3
The number of replicas for the Storaged service. spec.storaged.image
vesoft/nebula-storaged
The container image for the Storaged service. spec.storaged.version
master
The version of the Storaged service. spec.storaged.dataVolumeClaims.resources.requests.storage
- The storage size for the data disk of the Storaged service. You can specify multiple data disks. When specifying multiple data disks, the paths are like /usr/local/nebula/data1
, /usr/local/nebula/data2
, and so on. spec.storaged.dataVolumeClaims.storageClassName
- Storage configuration for the data disks of the Storaged service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.storaged.logVolumeClaim.storageClassName
- Storage configuration for the log disk of the Storaged service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.<metad|storaged|graphd>.securityContext
{}
Defines the permission and access control for the cluster containers to control access and execution of container operations. For details, see SecurityContext. spec.agent
{}
Configuration for the Agent service used for backup and recovery, and log cleaning functions. If you don't customize this configuration, the default configuration is used. spec.reference.name
{}
The name of the controller it depends on. spec.schedulerName
default-scheduler
The name of the scheduler. spec.imagePullPolicy
Always
The image pull policy for NebulaGraph images. For more details on pull policies, please see Image pull policy. spec.logRotate
{}
Log rotation configuration. For details, see Managing Cluster Logs. spec.enablePVReclaim
false
Defines whether to automatically delete PVCs after deleting the cluster to release data. For details, see Reclaim PV. spec.metad.licenseManagerURL
- Configures the URL pointing to the License Manager (LM), consisting of the access address and port (default port 9119
). For example, 192.168.8.xxx:9119
. For creating the NebulaGraph Enterprise Edition only. spec.storaged.enableAutoBalance
false
Whether to enable automatic balancing. For details, see Balancing Storage Data After Scaling Out. spec.enableBR
false
Defines whether to enable the BR tool. For details, see Backup and Restore. spec.imagePullSecrets
[]
Defines the Secret required to pull images from a private repository. Create the NebulaGraph cluster.
kubectl create -f nebulacluster.yaml -n nebula\n
Output:
nebulacluster.apps.nebula-graph.io/nebula created\n
If you don't specify the namespace using -n
, it will default to the default
namespace.
Check the status of the NebulaGraph cluster.
kubectl get nebulaclusters nebula -n nebula\n
Output:
NAME READY GRAPHD-DESIRED GRAPHD-READY METAD-DESIRED METAD-READY STORAGED-DESIRED STORAGED-READY AGE\nnebula True 1 1 1 1 1 1 86s\n
helm
","text":"Add the NebulaGraph Operator Helm repository (if it's already added, run the next step directly).
helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts\n
Update the Helm repository to fetch the latest resources.
helm repo update nebula-operator\n
Set environment variables for the configuration parameters required for installing the cluster.
export NEBULA_CLUSTER_NAME=nebula # Name of the NebulaGraph cluster.\nexport NEBULA_CLUSTER_NAMESPACE=nebula # Namespace for the NebulaGraph cluster.\nexport STORAGE_CLASS_NAME=local-sc # StorageClass for the NebulaGraph cluster.\n
Create a namespace for the NebulaGraph cluster if it is not created.
kubectl create namespace \"${NEBULA_CLUSTER_NAMESPACE}\"\n
Check the customizable configuration parameters for the nebula-cluster
Helm chart of the nebula-operator
when creating the cluster.
Run the following command to view all the configurable parameters.
helm show values nebula-operator/nebula-cluster\n
Example to view all configurable parameters nebula:\n version: master\n imagePullPolicy: Always\n storageClassName: \"\"\n enablePVReclaim: false\n enableBR: false\n enableForceUpdate: false\n schedulerName: default-scheduler \n topologySpreadConstraints:\n - topologyKey: \"kubernetes.io/hostname\"\n whenUnsatisfiable: \"ScheduleAnyway\"\n logRotate: {}\n reference:\n name: statefulsets.apps\n version: v1\n graphd:\n image: vesoft/nebula-graphd\n replicas: 2\n serviceType: NodePort\n env: []\n config: {}\n resources:\n requests:\n cpu: \"500m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"500Mi\"\n logVolume:\n enable: true\n storage: \"500Mi\"\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n\n metad:\n image: vesoft/nebula-metad\n replicas: 3\n env: []\n config: {}\n resources:\n requests:\n cpu: \"500m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n logVolume:\n enable: true\n storage: \"500Mi\"\n dataVolume:\n storage: \"2Gi\"\n licenseManagerURL: \"\"\n license: {}\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n\n storaged:\n image: vesoft/nebula-storaged\n replicas: 3\n env: []\n config: {}\n resources:\n requests:\n cpu: \"500m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n logVolume:\n enable: true\n storage: \"500Mi\"\n dataVolumes:\n - storage: \"10Gi\"\n enableAutoBalance: false\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n\n exporter:\n image: vesoft/nebula-stats-exporter\n version: v3.3.0\n replicas: 1\n env: []\n resources:\n requests:\n cpu: \"100m\"\n memory: \"128Mi\"\n limits:\n cpu: \"200m\"\n memory: \"256Mi\"\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n maxRequests: 20\n\n agent:\n image: vesoft/nebula-agent\n version: latest\n resources:\n requests:\n cpu: \"100m\"\n memory: \"128Mi\"\n limits:\n cpu: \"200m\"\n memory: \"256Mi\"\n\n console:\n username: root\n password: nebula\n image: vesoft/nebula-console\n version: latest\n nodeSelector: {}\n\n alpineImage: \"\"\n\nimagePullSecrets: []\nnameOverride: \"\"\nfullnameOverride: \"\" \n
Expand to view parameter descriptions Parameter Default Value Description nebula.version
master Version of the cluster. nebula.imagePullPolicy
Always
Container image pull policy. Always
means always attempting to pull the latest image from the remote. nebula.storageClassName
\"\"
Name of the Kubernetes storage class for dynamic provisioning of persistent volumes. nebula.enablePVReclaim
false
Enable persistent volume reclaim. See Reclaim PV for details. nebula.enableBR
false
Enable the backup and restore feature. See Backup and Restore with NebulaGraph Operator for details. nebula.enableForceUpdate
false
Force update the Storage service without transferring the leader partition replicas. See Optimize leader transfer in rolling updates for details. nebula.schedulerName
default-scheduler
Name of the Kubernetes scheduler. Must be configured as nebula-scheduler
when using the Zone feature. nebula.topologySpreadConstraints
[]
Control the distribution of pods in the cluster. nebula.logRotate
{}
Log rotation configuration. See Manage cluster logs for details. nebula.reference
{\"name\": \"statefulsets.apps\", \"version\": \"v1\"}
The workload referenced for a NebulaGraph cluster. nebula.graphd.image
vesoft/nebula-graphd
Container image for the Graph service. nebula.graphd.replicas
2
Number of replicas for the Graph service. nebula.graphd.serviceType
NodePort
Service type for the Graph service, defining how the Graph service is accessed. See Connect to the Cluster for details. nebula.graphd.env
[]
Container environment variables for the Graph service. nebula.graphd.config
{}
Configuration for the Graph service. See Customize the configuration of the NebulaGraph cluster for details. nebula.graphd.resources
{\"resources\":{\"requests\":{\"cpu\":\"500m\",\"memory\":\"500Mi\"},\"limits\":{\"cpu\":\"1\",\"memory\":\"500Mi\"}}}
Resource limits and requests for the Graph service. nebula.graphd.logVolume
{\"logVolume\": {\"enable\": true,\"storage\": \"500Mi\"}}
Log storage configuration for the Graph service. When enable
is false
, log volume is not used. nebula.metad.image
vesoft/nebula-metad
Container image for the Meta service. nebula.metad.replicas
3
Number of replicas for the Meta service. nebula.metad.env
[]
Container environment variables for the Meta service. nebula.metad.config
{}
Configuration for the Meta service. See Customize the configuration of the NebulaGraph cluster for details. nebula.metad.resources
{\"resources\":{\"requests\":{\"cpu\":\"500m\",\"memory\":\"500Mi\"},\"limits\":{\"cpu\":\"1\",\"memory\":\"1Gi\"}}}
Resource limits and requests for the Meta service. nebula.metad.logVolume
{\"logVolume\": {\"enable\": true,\"storage\": \"500Mi\"}}
Log storage configuration for the Meta service. When enable
is false
, log volume is not used. nebula.metad.dataVolume
{\"dataVolume\": {\"storage\": \"2Gi\"}}
Data storage configuration for the Meta service. nebula.metad.licenseManagerURL
\"\"
URL for the license manager (LM) to obtain license information. For creating the NebulaGraph Enterprise Edition only. nebula.storaged.image
vesoft/nebula-storaged
Container image for the Storage service. nebula.storaged.replicas
3
Number of replicas for the Storage service. nebula.storaged.env
[]
Container environment variables for the Storage service. nebula.storaged.config
{}
Configuration for the Storage service. See Customize the configuration of the NebulaGraph cluster for details. nebula.storaged.resources
{\"resources\":{\"requests\":{\"cpu\":\"500m\",\"memory\":\"500Mi\"},\"limits\":{\"cpu\":\"1\",\"memory\":\"1Gi\"}}}
Resource limits and requests for the Storage service. nebula.storaged.logVolume
{\"logVolume\": {\"enable\": true,\"storage\": \"500Mi\"}}
Log storage configuration for the Storage service. When enable
is false
, log volume is not used. nebula.storaged.dataVolumes
{\"dataVolumes\": [{\"storage\": \"10Gi\"}]}
Data storage configuration for the Storage service. Supports specifying multiple data volumes. nebula.storaged.enableAutoBalance
false
Enable automatic balancing. See Balance storage data after scaling out for details. nebula.exporter.image
vesoft/nebula-stats-exporter
Container image for the Exporter service. nebula.exporter.version
v3.3.0
Version of the Exporter service. nebula.exporter.replicas
1
Number of replicas for the Exporter service. nebula.exporter.env
[]
Environment variables for the Exporter service. nebula.exporter.resources
{\"resources\":{\"requests\":{\"cpu\":\"100m\",\"memory\":\"128Mi\"},\"limits\":{\"cpu\":\"200m\",\"memory\":\"256Mi\"}}}
Resource limits and requests for the Exporter service. nebula.agent.image
vesoft/nebula-agent
Container image for the agent service. nebula.agent.version
latest
Version of the agent service. nebula.agent.resources
{\"resources\":{\"requests\":{\"cpu\":\"100m\",\"memory\":\"128Mi\"},\"limits\":{\"cpu\":\"200m\",\"memory\":\"256Mi\"}}}
Resource limits and requests for the agent service. nebula.console.username
root
Username for accessing the NebulaGraph Console client. See Connect to the cluster for details. nebula.console.password
nebula
Password for accessing the NebulaGraph Console client. nebula.console.image
vesoft/nebula-console
Container image for the NebulaGraph Console client. nebula.console.version
latest
Version of the NebulaGraph Console client. nebula.alpineImage
\"\"
Alpine Linux container image used to obtain zone information for nodes. imagePullSecrets
[]
Names of secrets to pull private images. nameOverride
\"\"
Cluster name. fullnameOverride
\"\"
Name of the released chart instance. nebula.<graphd|metad|storaged|exporter>.podLabels
{}
Additional labels to be added to the pod. nebula.<graphd|metad|storaged|exporter>.podAnnotations
{}
Additional annotations to be added to the pod. nebula.<graphd|metad|storaged|exporter>.securityContext
{}
Security context for setting pod-level security attributes, including user ID, group ID, Linux Capabilities, etc. nebula.<graphd|metad|storaged|exporter>.nodeSelector
{}
Label selectors for determining which nodes to run the pod on. nebula.<graphd|metad|storaged|exporter>.tolerations
[]
Tolerations allow a pod to be scheduled to nodes with specific taints. nebula.<graphd|metad|storaged|exporter>.affinity
{}
Affinity rules for the pod, including node affinity, pod affinity, and pod anti-affinity. nebula.<graphd|metad|storaged|exporter>.readinessProbe
{}
Probe to check if a container is ready to accept service requests. When the probe returns success, traffic can be routed to the container. nebula.<graphd|metad|storaged|exporter>.livenessProbe
{}
Probe to check if a container is still running. If the probe fails, Kubernetes will kill and restart the container. nebula.<graphd|metad|storaged|exporter>.initContainers
[]
Special containers that run before the main application container starts, typically used for setting up the environment or initializing data. nebula.<graphd|metad|storaged|exporter>.sidecarContainers
[]
Containers that run alongside the main application container, typically used for auxiliary tasks such as log processing, monitoring, etc. nebula.<graphd|metad|storaged|exporter>.volumes
[]
Storage volumes to be attached to the service pod. nebula.<graphd|metad|storaged|exporter>.volumeMounts
[]
Specifies where to mount the storage volume inside the container. Create the NebulaGraph cluster.
You can use the --set
flag to customize the default values of the NebulaGraph cluster configuration. For example, --set nebula.storaged.replicas=3
sets the number of replicas for the Storage service to 3.
helm install \"${NEBULA_CLUSTER_NAME}\" nebula-operator/nebula-cluster \\ \n # Specify the version of the cluster chart. If not specified, it will install the latest version by default.\n # You can check all chart versions by running the command: helm search repo -l nebula-operator/nebula-cluster\n --version=1.8.0 \\\n # Specify the namespace for the NebulaGraph cluster.\n --namespace=\"${NEBULA_CLUSTER_NAMESPACE}\" \\\n # Customize the cluster name.\n --set nameOverride=\"${NEBULA_CLUSTER_NAME}\" \\\n --set nebula.storageClassName=\"${STORAGE_CLASS_NAME}\" \\\n # Specify the version for the NebulaGraph cluster.\n --set nebula.version=vmaster\n
Check the status of NebulaGraph cluster pods.
kubectl -n \"${NEBULA_CLUSTER_NAMESPACE}\" get pod -l \"app.kubernetes.io/cluster=${NEBULA_CLUSTER_NAME}\"\n
Output:
NAME READY STATUS RESTARTS AGE\nnebula-exporter-854c76989c-mp725 1/1 Running 0 14h\nnebula-graphd-0 1/1 Running 0 14h\nnebula-graphd-1 1/1 Running 0 14h\nnebula-metad-0 1/1 Running 0 14h\nnebula-metad-1 1/1 Running 0 14h\nnebula-metad-2 1/1 Running 0 14h\nnebula-storaged-0 1/1 Running 0 14h\nnebula-storaged-1 1/1 Running 0 14h\nnebula-storaged-2 1/1 Running 0 14h\n
This topic introduces how to upgrade a NebulaGraph cluster created with NebulaGraph Operator.
Legacy version compatibility
The 1.x version NebulaGraph Operator is not compatible with NebulaGraph of version below v3.x.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.2.cluster-upgrade/#limits","title":"Limits","text":"You have created a NebulaGraph cluster. For details, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.2.cluster-upgrade/#upgrade_a_nebulagraph_cluster_with_kubectl","title":"Upgrade a NebulaGraph cluster withkubectl
","text":"The following steps upgrade a NebulaGraph cluster from version 3.5.0
to master
.
Check the image version of the services in the cluster.
kubectl get pods -l app.kubernetes.io/cluster=nebula -o jsonpath=\"{.items[*].spec.containers[*].image}\" |tr -s '[[:space:]]' '\\n' |sort |uniq -c\n
Output:
1 vesoft/nebula-graphd:3.5.0\n 1 vesoft/nebula-metad:3.5.0\n 3 vesoft/nebula-storaged:3.5.0 \n
Edit the nebula
cluster configuration to change the version
value of the cluster services from 3.5.0 to master.
Open the YAML file for the nebula
cluster.
kubectl edit nebulacluster nebula -n <namespace>\n
Change the value of version
.
After making these changes, the YAML file should look like this:
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\nspec:\n graphd:\n version: master // Change the value from 3.5.0 to master.\n ...\n metad:\n version: master // Change the value from 3.5.0 to master.\n ...\n storaged:\n version: master // Change the value from 3.5.0 to master.\n ...\n
Apply the configuration.
After saving the YAML file and exiting, Kubernetes automatically updates the cluster's configuration and starts the cluster upgrade.
After waiting for about 2 minutes, run the following command to see if the image versions of the services in the cluster have been changed to master.
kubectl get pods -l app.kubernetes.io/cluster=nebula -o jsonpath=\"{.items[*].spec.containers[*].image}\" |tr -s '[[:space:]]' '\\n' |sort |uniq -c\n
Output:
1 vesoft/nebula-graphd:master\n 1 vesoft/nebula-metad:master\n 3 vesoft/nebula-storaged:master \n
helm
","text":"Update the information of available charts locally from chart repositories.
helm repo update\n
Set environment variables to your desired values.
export NEBULA_CLUSTER_NAME=nebula # The desired NebulaGraph cluster name.\nexport NEBULA_CLUSTER_NAMESPACE=nebula # The desired namespace where your NebulaGraph cluster locates.\n
Upgrade a NebulaGraph cluster.
For example, upgrade a cluster to master.
helm upgrade \"${NEBULA_CLUSTER_NAME}\" nebula-operator/nebula-cluster \\\n --namespace=\"${NEBULA_CLUSTER_NAMESPACE}\" \\\n --set nameOverride=${NEBULA_CLUSTER_NAME} \\\n --set nebula.version=master\n
The value of --set nebula.version
specifies the version of the cluster you want to upgrade to.
Run the following command to check the status and version of the upgraded cluster.
Check cluster status:
$ kubectl -n \"${NEBULA_CLUSTER_NAMESPACE}\" get pod -l \"app.kubernetes.io/cluster=${NEBULA_CLUSTER_NAME}\"\nNAME READY STATUS RESTARTS AGE\nnebula-graphd-0 1/1 Running 0 2m\nnebula-graphd-1 1/1 Running 0 2m\nnebula-metad-0 1/1 Running 0 2m\nnebula-metad-1 1/1 Running 0 2m\nnebula-metad-2 1/1 Running 0 2m\nnebula-storaged-0 1/1 Running 0 2m\nnebula-storaged-1 1/1 Running 0 2m\nnebula-storaged-2 1/1 Running 0 2m\n
Check cluster version:
$ kubectl get pods -l app.kubernetes.io/cluster=nebula -o jsonpath=\"{.items[*].spec.containers[*].image}\" |tr -s '[[:space:]]' '\\n' |sort |uniq -c\n 1 vesoft/nebula-graphd:master\n 1 vesoft/nebula-metad:master\n 3 vesoft/nebula-storaged:master\n
The upgrade process of a cluster is a rolling update process and can be time-consuming due to the state transition of the leader partition replicas in the Storage service. You can configure the enableForceUpdate
field in the cluster instance's YAML file to skip the leader partition replica transfer operation, thereby accelerating the upgrade process. For more information, see Specify a rolling update strategy.
If you encounter issues during the upgrade process, you can check the logs of the cluster service pods.
kubectl logs <pod-name> -n <namespace>\n
Additionally, you can inspect the cluster's status and events.
kubectl describe nebulaclusters <cluster-name> -n <namespace>\n
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.3.cluster-uninstall/","title":"Delete a NebulaGraph cluster","text":"This topic explains how to delete a NebulaGraph cluster created using NebulaGraph Operator.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.3.cluster-uninstall/#usage_limitations","title":"Usage limitations","text":"kubectl
","text":"View all created clusters.
kubectl get nc --all-namespaces\n
Example output:
NAMESPACE NAME READY GRAPHD-DESIRED GRAPHD-READY METAD-DESIRED METAD-READY STORAGED-DESIRED STORAGED-READY AGE\ndefault nebula True 2 2 3 3 3 3 38h\nnebula nebula2 True 1 1 1 1 1 1 2m7s\n
Delete a cluster. For example, run the following command to delete a cluster named nebula2
:
kubectl delete nc nebula2 -n nebula\n
Example output:
nebulacluster.nebula-graph.io \"nebula2\" deleted\n
Confirm the deletion.
kubectl get nc nebula2 -n nebula\n
Example output:
No resources found in nebula namespace.\n
helm
","text":"View all Helm releases.
helm list --all-namespaces\n
Example output:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION\nnebula default 1 2023-11-06 20:16:07.913136377 +0800 CST deployed nebula-cluster-1.7.1 1.7.1\nnebula-operator nebula-operator-system 3 2023-11-06 12:06:24.742397418 +0800 CST deployed nebula-operator-1.7.1 1.7.1\n
View detailed information about a Helm release. For example, to view the cluster information for a Helm release named nebula
:
helm get values nebula -n default\n
Example output:
USER-SUPPLIED VALUES:\nimagePullSecrets:\n- name: secret_for_pull_image\nnameOverride: nebula # The cluster name\nnebula:\n graphd:\n image: reg.vesoft-inc.com/xx\n metad:\n image: reg.vesoft-inc.com/xx\n licenseManagerURL: xxx:9119\n storageClassName: local-sc\n storaged:\n image: reg.vesoft-inc.com/xx\n version: v1.8.0 # The cluster version\n
Uninstall a Helm release. For example, to uninstall a Helm release named nebula
:
helm uninstall nebula -n default\n
Example output:
release \"nebula\" uninstalled\n
Once the Helm release is uninstalled, NebulaGraph Operator will automatically remove all K8s resources associated with that release.
Verify that the cluster resources are removed.
kubectl get nc nebula -n default\n
Example output:
No resources found in default namespace.\n
Local Persistent Volumes, abbreviated as Local PVs in K8s store container data directly using the node's local disk directory. Compared with network storage, Local Persistent Volumes provide higher IOPS and lower read and write latency, which is suitable for data-intensive applications. This topic introduces how to use Local PVs in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS) clusters, and how to enable automatic failover for Local PVs in the cloud.
While using Local Persistent Volumes can enhance performance, it's essential to note that, unlike network storage, local storage does not support automatic backup. In the event of a node failure, all data in local storage may be lost. Therefore, the utilization of Local Persistent Volumes involves a trade-off between service availability, data persistence, and flexibility.
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.1.use-local-pv/#principles","title":"Principles","text":"NebulaGraph Operator implements a Storage Volume Provisioner interface to automatically create and delete PV objects. Utilizing the provisioner, you can dynamically generate Local PVs as required. Based on the PVC and StorageClass specified in the cluster configuration file, NebulaGraph Operator automatically generates PVCs and associates them with their respective Local PVs.
When a Local PV is initiated by the provisioner interface, the provisioner controller generates a local
type PV and configures the nodeAffinity
field. This configuration ensures that Pods using the local
type PV are scheduled onto specific nodes. Conversely, when a Local PV is deleted, the provisioner controller eliminates the local
type PV object and purges the node's storage resources.
NebulaGraph Operator is installed. For details, see Install NebulaGraph Operator.
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.1.use-local-pv/#steps","title":"Steps","text":"The resources in the following examples are all created in the default
namespace.
Create a node pool with local SSDs if not existing
gcloud container node-pools create \"pool-1\" --cluster \"gke-1\" --region us-central1 --node-version \"1.27.10-gke.1055000\" --machine-type \"n2-standard-2\" --local-nvme-ssd-block count=2 --max-surge-upgrade 1 --max-unavailable-upgrade 0 --num-nodes 1 --enable-autoscaling --min-nodes 1 --max-nodes 2\n
For information about the parameters to create a node pool with local SSDs, see Create a node pool with Local SSD.
Format and mount the local SSDs using a DaemonSet.
Download the gke-daemonset-raid-disks.yaml file.
Deploy the RAID disks DaemonSet. The DaemonSet sets a RAID 0
array on all Local SSD disks and formats the device to an ext4
filesystem.
kubectl apply -f gke-daemonset-raid-disks.yaml\n
Deploy the Local PV provisioner.
kubectl apply -f local-pv-provisioner.yaml\n
In the NebulaGraph cluster configuration file, specify spec.storaged.dataVolumeClaims
or spec.metad.dataVolumeClaim
, and the StorageClass needs to be configured as local-nvme
. For more information about cluster configurations, see Create a NebulaGraph cluster.
...\nmetad: \n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme\nstoraged:\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme \n...\n
After the NebulaGraph is deployed, the Local PVs are automatically created.
View the PV list.
kubectl get pv\n
Return:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE\npvc-01be9b75-9c50-4532-8695-08e11b489718 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 local-nvme 3m35s\npvc-09de8eb1-1225-4025-b91b-fbc0bcce670f 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-1 local-nvme 3m35s\npvc-4b2a9ffb-9000-4998-a7bb-edb825c872cb 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-2 local-nvme 3m35s\n...\n
View the detailed information of the PV.
kubectl get pv pvc-01be9b75-9c50-4532-8695-08e11b489718 -o yaml\n
Return:
apiVersion: v1\nkind: PersistentVolume\nmetadata:\n annotations:\n local.pv.provisioner/selected-node: gke-snap-test-snap-test-591403a8-xdfc\n nebula-graph.io/pod-name: nebula-storaged-0\n pv.kubernetes.io/provisioned-by: nebula-cloud.io/local-pv\n creationTimestamp: \"2024-03-05T06:12:32Z\"\n finalizers:\n - kubernetes.io/pv-protection\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: storaged\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: pvc-01be9b75-9c50-4532-8695-08e11b489718\n resourceVersion: \"9999469\"\n uid: ee28a4da-6026-49ac-819b-2075154b4724\nspec:\n accessModes:\n - ReadWriteOnce\n capacity:\n storage: 5Gi\n claimRef:\n apiVersion: v1\n kind: PersistentVolumeClaim\n name: storaged-data-nebula-storaged-0\n namespace: default\n resourceVersion: \"9996541\"\n uid: 01be9b75-9c50-4532-8695-08e11b489718\n local:\n fsType: ext4\n path: /mnt/disks/raid0\n nodeAffinity:\n required:\n nodeSelectorTerms:\n - matchExpressions:\n - key: kubernetes.io/hostname\n operator: In\n values:\n - gke-snap-test-snap-test-591403a8-xdfc\n persistentVolumeReclaimPolicy: Delete\n storageClassName: local-nvme\n volumeMode: Filesystem\nstatus:\n phase: Bound \n
Create a node pool with Instance Store if not existing.
eksctl create nodegroup --instance-types m5ad.2xlarge --nodes 3 --cluster eks-1\n
For more information about parameters to cluster node pools, see Creating a managed node group.
Format and mount the local SSDs using a DaemonSet.
Download the eks-daemonset-raid-disks.yaml file.
Based on the node type created in step 1, modify the value of the nodeSelector.node.kubernetes.io/instance-type
field in the eks-daemonset-raid-disks.yaml
file as needed.
spec:\n nodeSelector:\n node.kubernetes.io/instance-type: \"m5ad.2xlarge\"\n
Install nvme-cli.
sudo apt-get update\nsudo apt-get install -y nvme-cli\n
sudo yum install -y nvme-cli\n
Deploy the RAID disk DaemonSet. The DaemonSet sets up a RAID 0
array on all local SSD disks and formats the devices as an ext4
file system.
kubectl apply -f gke-daemonset-raid-disks.yaml\n
Deploy the Local PV provisioner.
kubectl apply -f local-pv-provisioner.yaml\n
In the NebulaGraph cluster configuration file, specify spec.storaged.dataVolumeClaims
or spec.metad.dataVolumeClaim
, and the StorageClass needs to be configured as local-nvme
. For more information about cluster configurations, see Create a NebulaGraph cluster.
metad:\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme\nstoraged:\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme \n
View the PV list.
kubectl get pv\n
Return:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE\npvc-290c15cc-a302-4463-a591-84b7217a6cd2 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 local-nvme 3m40s\npvc-fbb3167f-f556-4a16-ae0e-171aed0ac954 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-1 local-nvme 3m40s\npvc-6c7cfe80-0134-4573-b93e-9b259c6fcd63 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-2 local-nvme 3m40s\n...\n
View the detailed information of the PV.
kubectl get pv pvc-290c15cc-a302-4463-a591-84b7217a6cd2 -o yaml\n
Return:
apiVersion: v1\nkind: PersistentVolume\nmetadata:\n annotations:\n local.pv.provisioner/selected-node: ip-192-168-77-60.ec2.internal\n nebula-graph.io/pod-name: nebula-storaged-0\n pv.kubernetes.io/provisioned-by: nebula-cloud.io/local-pv\n creationTimestamp: \"2024-03-04T07:51:32Z\"\n finalizers:\n - kubernetes.io/pv-protection\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: storaged\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: pvc-290c15cc-a302-4463-a591-84b7217a6cd2\n resourceVersion: \"7932689\"\n uid: 66c0a2d3-2914-43ad-93b5-6d84fb62acef\nspec:\n accessModes:\n - ReadWriteOnce\n capacity:\n storage: 5Gi\n claimRef:\n apiVersion: v1\n kind: PersistentVolumeClaim\n name: storaged-data-nebula-storaged-0\n namespace: default\n resourceVersion: \"7932688\"\n uid: 8ecb5d96-004b-4672-bac4-1355ae15eae4\n local:\n fsType: ext4\n path: /mnt/disks/raid0\n nodeAffinity:\n required:\n nodeSelectorTerms:\n - matchExpressions:\n - key: kubernetes.io/hostname\n operator: In\n values:\n - ip-192-168-77-60.ec2.internal\n persistentVolumeReclaimPolicy: Delete\n storageClassName: local-nvme\n volumeMode: Filesystem\nstatus:\n phase: Bound \n
When using network storage (e.g., AWS EBS, Google Cloud Persistent Disk, Azure Disk Storage, Ceph, NFS, etc.) as a PV, the storage resource is independent of any particular node. Therefore, the storage resource can be mounted and used by Pods regardless of the node to which the Pods are scheduled. However, when using a local storage disk as a PV, the storage resource can only be used by Pods on a specific node due to nodeAffinity.
The Storage service of NebulaGraph supports data redundancy, which allows you to set multiple odd-numbered partition replicas. When a node fails, the associated partition is automatically transferred to a healthy node. However, Storage Pods using Local Persistent Volumes cannot run on other nodes due to the node affinity setting and must wait for the node to recover. To run on another node, the Pods must be unbound from the associated Local Persistent Volume.
NebulaGraph Operator supports automatic failover in the event of a node failure while using Local Persistent Volumes in the cloud for elastic scaling. This is achieved by setting spec.enableAutoFailover
to true
in the cluster configuration file, which automatically unbinds the Pods from the Local Persistent Volume, allowing the Pods to run on another node.
Example configuration:
...\nspec:\n # Enable automatic failover for Local PV.\n enableAutoFailover: true\n # The time to wait for the Storage service to be in the `OFFLINE` status\n # before automatic failover. \n # The default value is 5 minutes.\n # If the Storage service recovers to the `ONLINE` status during this period,\n # failover will not be triggered.\n failoverPeriod: \"2m\"\n ...\n
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.2.pv-expansion/","title":"Dynamically expand persistent volumes","text":"In a Kubernetes environment, NebulaGraph's data is stored on Persistent Volumes (PVs). Dynamic volume expansion refers to increasing the capacity of a volume without stopping the service, enabling NebulaGraph to accommodate growing data. This topic explains how to dynamically expand the PV for NebulaGraph services in a Kubernetes environment.
Note
In Kubernetes, a StorageClass is a resource that defines a particular storage type. It describes a class of storage, including its provisioner, parameters, and other details. When creating a PersistentVolumeClaim (PVC) and specifying a StorageClass, Kubernetes automatically creates a corresponding PV. The principle of dynamic volume expansion is to edit the PVC and increase the volume's capacity. Kubernetes will then automatically expand the capacity of the PV associated with this PVC based on the specified storageClassName
in the PVC. During this process, new PVs are not created; the size of the existing PV is changed. Only dynamic storage volumes, typically those associated with a storageClassName
, support dynamic volume expansion. Additionally, the allowVolumeExpansion
field in the StorageClass must be set to true
. For more details, see the Kubernetes documentation on expanding Persistent Volume Claims.
In NebulaGraph Operator, you cannot directly edit PVC because Operator automatically creates PVC based on the configuration in the spec.<metad|storaged>.dataVolumeClaim
of the Nebula Graph cluster. Therefore, you need to modify the cluster's configuration to update the PVC and trigger dynamic online volume expansion for the PV.
allowVolumeExpansion
field in the StorageClass is set to true
.provisioner
configured in the StorageClass supports dynamic expansion.In the following example, we assume that the StorageClass is named ebs-sc
and the NebulaGraph cluster is named nebula
. We will demonstrate how to dynamically expand the PV for the Storage service.
Check the status of the Storage service Pod:
kubectl get pod\n
Example output:
nebula-storaged-0 1/1 Running 0 43h\n
Check the PVC and PV information for the Storage service:
# View PVC \nkubectl get pvc\n
Example output:
storaged-data-nebula-storaged-0 Bound pvc-36ca3871-9265-460f-b812-7e73a718xxxx 5Gi RWO ebs-sc 43h\n
# View PV and confirm that the capacity of the PV is 5Gi\nkubectl get pv\n
Example output:
pvc-36ca3871-9265-460f-b812-xxx 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 ebs-sc 43h\n
Assuming all the above-mentioned prerequisites are met, use the following command to request an expansion of the PV for the Storage service to 10Gi:
kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"dataVolumeClaims\":[{\"resources\": {\"requests\": {\"storage\": \"10Gi\"}}, \"storageClassName\": \"ebs-sc\"}]}}}'\n
Example output:
nebulacluster.apps.nebula-graph.io/nebula patched\n
After waiting for about a minute, check the expanded PVC and PV information:
kubectl get pvc\n
Example output:
storaged-data-nebula-storaged-0 Bound pvc-36ca3871-9265-460f-b812-7e73a718xxxx 10Gi RWO ebs-sc 43h\n
kubectl get pv\n
Example output:
pvc-36ca3871-9265-460f-b812-xxx 10Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 ebs-sc 43h\n
As you can see, both the PVC and PV capacity have been expanded to 10Gi.
NebulaGraph Operator uses PVs (Persistent Volumes) and PVCs (Persistent Volume Claims) to store persistent data. If you accidentally deletes a NebulaGraph cluster, by default, PV and PVC objects and the relevant data will be retained to ensure data security.
You can also define the automatic deletion of PVCs to release data by setting the parameter spec.enablePVReclaim
to true
in the configuration file of the cluster instance. As for whether PV will be deleted automatically after PVC is deleted, you need to customize the PV reclaim policy. See reclaimPolicy in StorageClass and PV Reclaiming for details.
A NebulaGraph cluster is created in Kubernetes. For specific steps, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.3.configure-pv-reclaim/#steps","title":"Steps","text":"The following example uses a cluster named nebula
and the cluster's configuration file named nebula_cluster.yaml
to show how to set enablePVReclaim
:
Run the following command to edit the nebula
cluster's configuration file.
kubectl edit nebulaclusters.apps.nebula-graph.io nebula\n
Add enablePVReclaim
and set its value to true
under spec
.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\nspec:\n enablePVReclaim: true //Set its value to true.\n graphd:\n image: vesoft/nebula-graphd\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: master\n imagePullPolicy: IfNotPresent\n metad:\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n image: vesoft/nebula-metad\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: master\n nodeSelector:\n nebula: cloud\n reference:\n name: statefulsets.apps\n version: v1\n schedulerName: default-scheduler\n storaged:\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n - resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n image: vesoft/nebula-storaged\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n replicas: 3\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: master\n... \n
Run kubectl apply -f nebula_cluster.yaml to push your configuration changes to the cluster.
After setting enablePVReclaim to true, the PVCs of the cluster are deleted automatically when the cluster is deleted. If you also want the PVs to be deleted, set the reclaim policy of the PVs to Delete.
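For a PV that already exists, the reclaim policy can also be changed in place with a standard Kubernetes patch; a sketch, where <pv-name> is a placeholder for the PV bound to the cluster's PVC:
kubectl patch pv <pv-name> -p '{\"spec\":{\"persistentVolumeReclaimPolicy\":\"Delete\"}}'\n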
Kubernetes Admission Control is a security mechanism running as a webhook at runtime. It intercepts and modifies requests to ensure the cluster's security. Admission webhooks involve two main operations: validation and mutation. NebulaGraph Operator supports only validation operations and provides some default admission control rules. This topic describes NebulaGraph Operator's default admission control rules and how to enable admission control.
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.2.enable-admission-control/#prerequisites","title":"Prerequisites","text":"A NebulaGraph cluster is created with NebulaGrpah Operator. For detailed steps, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.2.enable-admission-control/#admission_control_rules","title":"Admission control rules","text":"Kubernetes admission control allows you to insert custom logic or policies before Kubernetes API Server processes requests. This mechanism can be used to implement various security policies, such as restricting a Pod's resource consumption or limiting its access permissions. NebulaGraph Operator supports validation operations, which means it validates and intercepts requests without making changes.
After admission control is enabled, NebulaGraph Operator implements the following admission validation control rules by default. You cannot disable these rules:
Additional PVs cannot be added to the Storage service by modifying dataVolumeClaims.
The PVC capacity of the Storage service cannot be shrunk; it can only be increased.
No secondary operation is allowed while the Storage service is scaling in or out.
After admission control is enabled, NebulaGraph Operator allows you to add annotations to implement the following admission validation control rules:
Clusters with the ha-mode annotation must have the minimum number of replicas as required by high availability mode:
Note
High availability mode refers to the high availability of NebulaGraph cluster services. Storage and Meta services are stateful, and the number of replicas should be an odd number due to Raft protocol requirements for data consistency. In high availability mode, at least 3 Storage services and 3 Meta services are required. Graph services are stateless, so their number of replicas can be even but should be at least 2.
Clusters with the delete-protection annotation cannot be deleted. For more information, see Configure deletion protection.
To ensure secure communication and data integrity between the K8s API server and the admission webhook, the communication is done over HTTPS by default. This means that TLS certificates are required for the admission webhook. cert-manager is a Kubernetes certificate management controller that automates the issuance and renewal of certificates. NebulaGraph Operator uses cert-manager to manage certificates.
Once cert-manager is installed and admission control is enabled, NebulaGraph Operator will automatically create an Issuer for issuing the necessary certificate for the admission webhook, and a Certificate for storing the issued certificate. The issued certificate is stored in the nebula-operator-webhook-secret Secret.
Install cert-manager.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.1/cert-manager.yaml\n
It is suggested to deploy the latest version of cert-manager. For details, see the official cert-manager documentation.
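Before proceeding, you may want to confirm that the cert-manager Pods are up; by default, cert-manager installs into the cert-manager namespace:
kubectl get pods -n cert-manager\n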
Modify the NebulaGraph Operator configuration file to enable admission control. Admission control is disabled by default and needs to be enabled manually.
# Check the current configuration\nhelm show values nebula-operator/nebula-operator\n
# Modify the configuration by setting `enableAdmissionWebhook` to `true`.\nhelm upgrade nebula-operator nebula-operator/nebula-operator --set enableAdmissionWebhook=true\n
Note
nebula-operator is the name of the chart repository, and nebula-operator/nebula-operator is the chart name. If the chart's namespace is not specified, it defaults to default.
View the certificate Secret for the admission webhook.
kubectl get secret nebula-operator-webhook-secret -o yaml\n
If the output includes certificate contents, it means that the admission webhook's certificate has been successfully created.
Verify the control rules.
Verify preventing additional PVs from being added to Storage service.
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"dataVolumeClaims\":[{\"resources\": {\"requests\": {\"storage\": \"2Gi\"}}, \"storageClassName\": \"local-path\"},{\"resources\": {\"requests\": {\"storage\": \"3Gi\"}}, \"storageClassName\": \"fast-disks\"}]}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: spec.storaged.dataVolumeClaims: Forbidden: storaged dataVolumeClaims is immutable\n
Verify disallowing shrinking Storage service's PVC capacity.
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"dataVolumeClaims\":[{\"resources\": {\"requests\": {\"storage\": \"1Gi\"}}, \"storageClassName\": \"fast-disks\"}]}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: spec.storaged.dataVolumeClaims: Invalid value: resource.Quantity{i:resource.int64Amount{value:1073741824, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"1Gi\", Format:\"BinarySI\"}: data volume size can only be increased\n
Verify disallowing any secondary operation during Storage service scale-in.
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"replicas\": 5}}}'\nnebulacluster.apps.nebula-graph.io/nebula patched\n$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"replicas\": 3}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: [spec.storaged: Forbidden: field is immutable while in ScaleOut phase, spec.storaged.replicas: Invalid value: 3: field is immutable while not in Running phase]\n
Verify the minimum number of replicas in high availability mode.
# Annotate the cluster to enable high availability mode.\n$ kubectl annotate nc nebula nebula-graph.io/ha-mode=true\n# Verify the minimum number of the Graph service's replicas.\n$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"graphd\": {\"replicas\":1}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: spec.graphd.replicas: Invalid value: 1: should be at least 2 in HA mode\n
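If you later need to lift the high availability restriction, the standard kubectl syntax for removing an annotation (a trailing hyphen after the key) should apply; a sketch:
# The trailing hyphen removes the annotation.\nkubectl annotate nc nebula nebula-graph.io/ha-mode-\n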
NebulaGraph Operator supports deletion protection to prevent NebulaGraph clusters from being deleted by accident. This topic describes how to configure deletion protection for a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.3.config-deletion-protection/#prerequisites","title":"Prerequisites","text":"Add the delete-protection
annotation to the cluster.
kubectl annotate nc nebula -n nebula-test nebula-graph.io/delete-protection=true\n
The preceding command enables deletion protection for the nebula cluster in the nebula-test namespace."},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.3.config-deletion-protection/#verify_deletion_protection","title":"Verify deletion protection","text":"To verify that deletion protection is enabled, run the following command:
kubectl delete nc nebula -n nebula-test\n
The preceding command attempts to delete the nebula cluster in the nebula-test namespace.
Return:
Error from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: metadata.annotations[nebula-graph.io/delete-protection]: Forbidden: protected cluster cannot be deleted\n
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.3.config-deletion-protection/#remove_the_annotation_to_disable_deletion_protection","title":"Remove the annotation to disable deletion protection","text":"Remove the delete-protection
annotation from the cluster as follows:
kubectl annotate nc nebula -n nebula-test nebula-graph.io/delete-protection-\n
The preceding command disables deletion protection for the nebula cluster in the nebula-test namespace.
NebulaGraph Operator calls the interface provided by NebulaGraph clusters to dynamically sense cluster service status. Once an exception is detected (for example, a component in a NebulaGraph cluster stops running), NebulaGraph Operator automatically performs fault tolerance. This topic shows how NebulaGraph Operator performs self-healing by simulating a cluster failure: deleting one Storage service Pod in a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.8.ha-and-balancing/4.8.1.self-healing/#prerequisites","title":"Prerequisites","text":"Install NebulaGraph Operator
"},{"location":"k8s-operator/4.cluster-administration/4.8.ha-and-balancing/4.8.1.self-healing/#steps","title":"Steps","text":"Create a NebulaGraph cluster. For more information, see Create a NebulaGraph clusters.
Delete the Pod named <cluster-name>-storaged-2 after all Pods are in the Running status.
kubectl delete pod <cluster-name>-storaged-2 --now\n
<cluster_name>
is the name of your NebulaGraph cluster. NebulaGraph Operator automates the creation of the Pod named <cluster-name>-storaged-2
to perform self-healing.
Run the kubectl get pods command to check the status of the Pod <cluster-name>-storaged-2.
...\nnebula-cluster-storaged-1 1/1 Running 0 5d23h\nnebula-cluster-storaged-2 0/1 ContainerCreating 0 1s\n...\n
...\nnebula-cluster-storaged-1 1/1 Running 0 5d23h\nnebula-cluster-storaged-2 1/1 Running 0 4m2s\n...\n
When the status of <cluster-name>-storaged-2 changes from ContainerCreating to Running, the self-healing has been performed successfully.
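To follow such a transition in real time instead of polling, you can watch the Storage Pods with the label selector used elsewhere in this manual; <cluster-name> is a placeholder:
kubectl get pods -l app.kubernetes.io/cluster=<cluster-name>,app.kubernetes.io/component=storaged -w\n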
When a NebulaGraph cluster created by NebulaGraph Operator performs a rolling update, a storage node temporarily stops providing services for the update. For an overview of rolling updates, see Performing a Rolling Update. If the node hosting the leader replica stops providing services, it will result in the unavailability of read and write operations for that partition. To avoid this situation, by default, NebulaGraph Operator transfers the leader replicas to other unaffected nodes during the rolling update process of a NebulaGraph cluster. This way, when a storage node is being updated, the leader replicas on other nodes can continue processing client requests, ensuring the read and write availability of the cluster.
The process of migrating all leader replicas from one storage node to the other nodes may take a long time. To better control the rolling update duration, Operator provides a field called enableForceUpdate
. When it is confirmed that there is no external access traffic, you can set this field to true
. This way, the leader replicas will not be transferred to other nodes, thereby speeding up the rolling update process.
Operator triggers a rolling update of the NebulaGraph cluster under the following circumstances:
In the YAML file for creating a cluster instance, add the spec.storaged.enableForceUpdate
field and set it to true
or false
to control the rolling update speed.
When enableForceUpdate
is set to true
, it means that the leader partition replicas are not transferred, thus speeding up the rolling update process. Conversely, when set to false
, it means that the leader replicas are transferred to other nodes to ensure the read and write availability of the cluster. The default value is false
.
Warning
When setting enableForceUpdate
to true
, make sure there is no traffic entering the cluster for read and write operations. This is because this setting will force the cluster pods to be rebuilt, and during this process, data loss or client request failures may occur.
Configuration example:
...\nspec:\n...\n storaged:\n # When set to true,\n # it means that the leader partition replicas are not transferred,\n # but the cluster pods are rebuilt directly.\n enableForceUpdate: true \n ...\n
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/","title":"Restart service Pods in a NebulaGraph cluster on K8s","text":"Note
Restarting NebulaGraph cluster service Pods is a feature in the Alpha version.
During routine maintenance, it might be necessary to restart a specific service Pod in the NebulaGraph cluster, for instance, when the Pod's status is abnormal or to enforce a restart. Restarting a Pod essentially means restarting the service process. To ensure high availability, NebulaGraph Operator supports gracefully restarting all Pods of the Graph, Meta, or Storage service respectively and gracefully restarting an individual Pod of the Storage service.
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/#prerequisites","title":"Prerequisites","text":"A NebulaGraph cluster is created in a K8s environment. For details, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/#restart_all_pods_of_a_certain_service_type","title":"Restart all Pods of a certain service type","text":"To gracefully roll restart all Pods of a certain service type in the cluster, you can add an annotation (nebula-graph.io/restart-timestamp
) with the current time to the configuration of the StatefulSet controller of the corresponding service.
When NebulaGraph Operator detects that the StatefulSet controller of the corresponding service has the annotation nebula-graph.io/restart-timestamp
and its value is changed, it triggers the graceful rolling restart operation for all Pods of that service type in the cluster.
In the following example, the annotation is added for all Graph services so that all Pods of these Graph services are restarted one by one.
Assume that the cluster name is nebula
and the cluster resources are in the default
namespace. Run the following command:
Check the name of the StatefulSet controller.
kubectl get statefulset \n
Example output:
NAME READY AGE\nnebula-graphd 2/2 33s\nnebula-metad 3/3 69s\nnebula-storaged 3/3 69s\n
Get the current timestamp.
date -u +%s\n
Example output:
1700547115\n
Overwrite the timestamp annotation of the StatefulSet controller to trigger the graceful rolling restart operation.
kubectl annotate statefulset nebula-graphd nebula-graph.io/restart-timestamp=\"1700547115\" --overwrite\n
Example output:
statefulset.apps/nebula-graphd annotated\n
Observe the restart process.
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=graphd -w\n
Example output:
NAME READY STATUS RESTARTS AGE\nnebula-graphd-0 1/1 Running 0 9m37s\nnebula-graphd-1 0/1 Running 0 17s\nnebula-graphd-1 1/1 Running 0 20s\nnebula-graphd-0 1/1 Terminating 0 9m40s\nnebula-graphd-0 0/1 Terminating 0 9m41s\nnebula-graphd-0 0/1 Terminating 0 9m42s\nnebula-graphd-0 0/1 Terminating 0 9m42s\nnebula-graphd-0 0/1 Terminating 0 9m42s\nnebula-graphd-0 0/1 Pending 0 0s\nnebula-graphd-0 0/1 Pending 0 0s\nnebula-graphd-0 0/1 ContainerCreating 0 0s\nnebula-graphd-0 0/1 Running 0 2s\n
The above output shows the status of Graph service Pods during the restart process.
Verify that the StatefulSet controller annotation is updated.
kubectl get statefulset nebula-graphd -o yaml | grep \"nebula-graph.io/restart-timestamp\"\n
Example output:
nebula-graph.io/last-applied-configuration: '{\"persistentVolumeClaimRetentionPolicy\":{\"whenDeleted\":\"Retain\",\"whenScaled\":\"Retain\"},\"podManagementPolicy\":\"Parallel\",\"replicas\":2,\"revisionHistoryLimit\":10,\"selector\":{\"matchLabels\":{\"app.kubernetes.io/cluster\":\"nebula\",\"app.kubernetes.io/component\":\"graphd\",\"app.kubernetes.io/managed-by\":\"nebula-operator\",\"app.kubernetes.io/name\":\"nebula-graph\"}},\"serviceName\":\"nebula-graphd-headless\",\"template\":{\"metadata\":{\"annotations\":{\"nebula-graph.io/cm-hash\":\"7c55c0e5ac74e85f\",\"nebula-graph.io/restart-timestamp\":\"1700547815\"},\"creationTimestamp\":null,\"labels\":{\"app.kubernetes.io/cluster\":\"nebula\",\"app.kubernetes.io/component\":\"graphd\",\"app.kubernetes.io/managed-by\":\"nebula-operator\",\"app.kubernetes.io/name\":\"nebula-graph\"}},\"spec\":{\"containers\":[{\"command\":[\"/bin/sh\",\"-ecx\",\"exec\nnebula-graph.io/restart-timestamp: \"1700547115\"\n nebula-graph.io/restart-timestamp: \"1700547815\" \n
The above output indicates that the annotation of the StatefulSet controller has been updated, and all Graph service Pods have been restarted.
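As a convenience, steps 2 and 3 above can be collapsed into a single command by substituting the timestamp inline; a minimal sketch:
# Stamp the current UTC epoch time in one step.\nkubectl annotate statefulset nebula-graphd nebula-graph.io/restart-timestamp=\"$(date -u +%s)\" --overwrite\n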
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/#restart_a_single_storage_service_pod","title":"Restart a single Storage service Pod","text":"To gracefully roll restart a single Storage service Pod, you can add an annotation (nebula-graph.io/restart-ordinal
) with the value set to the ordinal number of the Storage service Pod you want to restart. This triggers a graceful restart or state transition for that specific Storage service Pod. The added annotation will be automatically removed after the Storage service Pod is restarted.
In the following example, the annotation is added for the Pod with ordinal number 1
, indicating a graceful restart for the nebula-storaged-1
Storage service Pod.
Assume that the cluster name is nebula
, and the cluster resources are in the default
namespace. Run the following commands:
Check the name of the StatefulSet controller.
kubectl get statefulset \n
Example output:
NAME READY AGE\nnebula-graphd 2/2 33s\nnebula-metad 3/3 69s\nnebula-storaged 3/3 69s\n
Get the ordinal number of the Storage service Pod.
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=storaged\n
Example output:
NAME READY STATUS RESTARTS AGE\nnebula-storaged-0 1/1 Running 0 13h\nnebula-storaged-1 1/1 Running 0 13h\nnebula-storaged-2 1/1 Running 0 13h\nnebula-storaged-3 1/1 Running 0 13h\nnebula-storaged-4 1/1 Running 0 13h\nnebula-storaged-5 1/1 Running 0 13h\nnebula-storaged-6 1/1 Running 0 13h\nnebula-storaged-7 1/1 Running 0 13h\nnebula-storaged-8 1/1 Running 0 13h\n
Add the annotation for the nebula-storaged-1 Pod to trigger a graceful restart for that specific Pod.
kubectl annotate statefulset nebula-storaged nebula-graph.io/restart-ordinal=\"1\" \n
Example output:
statefulset.apps/nebula-storaged annotated\n
Observe the restart process.
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=storaged -w\n
Example output:
NAME READY STATUS RESTARTS AGE\nnebula-storaged-0 1/1 Running 0 13h\nnebula-storaged-1 1/1 Running 0 13h\nnebula-storaged-2 1/1 Running 0 13h\nnebula-storaged-3 1/1 Running 0 13h\nnebula-storaged-4 1/1 Running 0 13h\nnebula-storaged-5 1/1 Running 0 12h\nnebula-storaged-6 1/1 Running 0 12h\nnebula-storaged-7 1/1 Running 0 12h\nnebula-storaged-8 1/1 Running 0 12h\n\n\nnebula-storaged-1 1/1 Running 0 13h\nnebula-storaged-1 1/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Pending 0 0s\nnebula-storaged-1 0/1 Pending 0 0s\nnebula-storaged-1 0/1 ContainerCreating 0 0s\nnebula-storaged-1 0/1 Running 0 1s\nnebula-storaged-1 1/1 Running 0 10s \n
The above output indicates that the nebula-storaged-1 Storage service Pod has been restarted successfully.
After restarting a single Storage service Pod, the distribution of storage leader replicas may become unbalanced. You can execute the BALANCE LEADER command to rebalance the distribution of leader replicas. For information about how to view the leader distribution, see SHOW HOSTS.
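A minimal sketch of that rebalancing step from the NebulaGraph console; note that recent NebulaGraph versions issue the statement as a job (SUBMIT JOB BALANCE LEADER), while older versions accept BALANCE LEADER directly, and <space_name> is a placeholder:
nebula> USE <space_name>;\nnebula> SUBMIT JOB BALANCE LEADER;\nnebula> SHOW HOSTS;  # check the Leader count column for an even distribution\n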
Before using NebulaGraph Cloud, you need to create a subscription on Azure. This topic describes how to create a subscription on Azure Marketplace.
"},{"location":"nebula-cloud/2.how-to-create-subsciption/#subscription_workflow","title":"Subscription workflow","text":"Enter the Azure Marketplace, and search for NebulaGraph Cloud in the search bar in Marketplace, or directly click NebulaGraph Cloud to enter the subscription page. [TODO]
Select a plan according to your own needs and click Set up + subscribe.
On the Basics page of Subscribe NebulaGraph Cloud, fill in the following plan details:
Project details
Field Description Subscription Select a subscription. Resource group Select an existing resource group or create a new one.SaaS details
Field Description Name Create a name for this SaaS subscription to easily identify it later. Recurring billingOn
or Off
. At the bottom of the Basics page, click Next: Tags.
After the subscription is completed, you need to click Open the SaaS account on the publisher's website
to create and configure your Solution. For details, see How to configure a Solution.
Solution refers to the NebulaGraph database running on NebulaGraph Cloud. After subscribing NebulaGraph Cloud on Azure, you need to configure your Solutions on the Cloud platform to complete the purchase. This topic describes how to configure a Solution.
"},{"location":"nebula-cloud/3.how-to-set-solution/#configuration_workflow","title":"Configuration workflow","text":"Log in to the Azure account that has subscribed the Solution service in NebulaGraph Cloud.
Select a region in the Provider section.
Caution
The region of the database you select should be in the same area as that of your business to avoid performance and speed problems.
Configure the type and the number of the query engine and the type, the number, and the disk size of the storage engine in the Instance section.
Caution
It is recommended to configure at least 2 query engines and 3 storage engines to ensure high service availability.
Enter the specified Azure account mailbox as the Root user in the NebulaGraph section.
Click Next at the bottom of this page.
For now, you have completed the configuration of the Solution. If the status of the Solution is running on the Cloud homepage, the Solution has been created successfully.
You may see the following status and corresponding description on the Solution page.
Status Description creating The resources required by a Solution are ready and the Solution will be created automatically. At this time, the Solution is in the creating state, which may last from several minutes to over ten minutes. starting After you have restarted a Solution, it will be in the starting state for a while. stopping After you have clicked Stop Solution, the Solution will be in the stopping state for a while. deleting After you have clicked Delete Solution, the Solution will be in the deleting state for a while. running After you create a Solution, it will be in the running state for a long time. stopped After you stop a Solution, it will be in the stopped state for a long time. deleted After you delete a Solution, it will be in the deleted state for a long time. create_failed If you failed to create a Solution, the Solution will be in the create_failed state for a long time. stop_failed If you failed to stop a Solution, the Solution will be in the stop_failed state for a long time. start_failed If you failed to start a Solution, the Solution will be in the start_failed state for a long time.Caution
If a Solution stays in an intermediate state for a long time and the page remains unchanged after refreshing, it means that there is an exception and you need to submit an order to solve the problem.
Caution
If a Solution is in the state of create_failed, stop_failed, or start_failed, you can execute CREATE, STOP, or START again.
"},{"location":"nebula-cloud/4.user-role-description/","title":"Cloud Solution roles","text":"After creating a Solution, you need to confirm the role privileges in the Cloud platform. This topic introduces the role privileges in the Cloud Solution.
"},{"location":"nebula-cloud/4.user-role-description/#built-in_roles","title":"Built-in roles","text":"NebulaGraph Cloud has multiple built-in roles:
On the Solution page, users with different roles will see different sidebars. The following describes the privileges of each role. Among them, Y means that this role can view this page, and N means that it cannot.
Page OWNER ROOT USER Solution Info Y Y Y Applications Y Y Y Connectivity Y N N Root Management Y N N User Management N Y N Audit Log Y N N Settings Y N N Subscribe Settings Y N N Billing Y N N"},{"location":"nebula-cloud/7.terms-and-conditions/","title":"Terms of Service","text":"These terms and conditions (\"Agreement\") sets forth the general terms and conditions of your use of the https://cloud.nebula-cloud.io website (\"Website\" or \"Service\") and any of its related products and services (collectively, \"Services\"). This Agreement is legally binding between you (\"User\", \"you\" or \"your\") and vesoft inc. (\"vesoft inc.\", \"we\", \"us\" or \"our\"). By accessing and using the Website and Services, you acknowledge that you have read, understood, and agree to be bound by the terms of this Agreement. If you are entering into this Agreement on behalf of a business or other legal entity, you represent that you have the authority to bind such entity to this Agreement, in which case the terms \"User\", \"you\" or \"your\" shall refer to such entity. If you do not have such authority, or if you do not agree with the terms of this Agreement, you must not accept this Agreement and may not access and use the Website and Services. You acknowledge that this Agreement is a contract between you and vesoft inc., even though it is electronic and is not physically signed by you, and it governs your use of the Website and Services.
"},{"location":"nebula-cloud/7.terms-and-conditions/#accounts","title":"Accounts","text":"You give NebulaGraph Cloud permission to use your Azure account as your NebulaGraph Cloud account and get your account information so that NebulaGraph Cloud can contact you regarding this product and related products. You understand that the rights to use NebulaGraph Cloud come from vesoft instead of Microsoft, vesoft is the provider of this product. Use of NebulaGraph Cloud is governed by provider's terms of service, service-level agreement, and privacy policy.
"},{"location":"nebula-cloud/7.terms-and-conditions/#billing_and_payments","title":"Billing and payments","text":"Microsoft collects payments from you for your commercial marketplace purchases. You may pay the fees for the NebulaGraph Cloud services according to your chosen solutions. You shall pay all fees or charges to your account in accordance with the fees, charges, and billing terms in effect at the time a fee or charge is due and payable.
"},{"location":"nebula-cloud/7.terms-and-conditions/#accuracy_of_information","title":"Accuracy of information","text":"Occasionally there may be information on the Website that contains typographical errors, inaccuracies or omissions that may relate to pricing, availability, promotions and offers. We reserve the right to correct any errors, inaccuracies or omissions, and to change or update information or cancel orders if any information on the Website or Services is inaccurate at any time without prior notice (including after you have submitted your order). We undertake no obligation to update, amend or clarify information on the Website including, without limitation, pricing information, except as required by law. No specified update or refresh date applied on the Website should be taken to indicate that all information on the Website or Services has been modified or updated.
"},{"location":"nebula-cloud/7.terms-and-conditions/#data_and_content_protection","title":"Data and content protection","text":"vesoft understands and recognizes that all the data processed, stored, uploaded, downloaded, distributed, or processed through services provided by NebulaGraph Cloud is your data or content, and you fully own your data and content. Except for the implementation of your service requirements, no unauthorized use or disclosure of your data or content will be made except in the following circumstances:
a.vesoft may disclose the data or content in any legal proceeding or to a governmental body as required by Law;
b.an agreement made between you and vesoft.
You can delete or edit your data or content yourself. If you have deleted the service or data, vesoft will delete your data and will no longer retain such data in accordance with your instructions. You should operate carefully with regard to operations such as deletion or modification.
You understand and agree: when your subscription is in the Suspended state, Microsoft gives the customer a 30-day grace period before automatically canceling the subscription. After the 30-day grace period is over, the webhook will receive an Unsubscribe action. After vesoft receives a cancellation webhook call, vesoft will only continue to store your data and content (if any) within 7 days. After 7 days, vesoft will delete all your data and content, including all cached or backup copies, and will no longer retain any of them.
Once the data or content is deleted, it cannot be restored; you shall take responsibilities caused by the data being deleted. You understand and agree that vesoft has no obligation to continue to retain, export or return your data or content.
"},{"location":"nebula-cloud/7.terms-and-conditions/#links_to_other_resources","title":"Links to other resources","text":"Although the Website and Services may link to other resources (such as websites), we are not, directly or indirectly, implying any approval, association, sponsorship, endorsement, or affiliation with any linked resource, unless specifically stated herein. You acknowledge that vesoft inc. is providing these links to you only as a convenience. We are not responsible for examining or evaluating, and we do not warrant the offerings of, any businesses or individuals or the content of their resources. We do not assume any responsibility or liability for the actions, products, services, and content of any other third parties. You should carefully review the legal statements and other conditions of use of any resource which you access through a link on the Website and Services. Your linking to any other off-site resources is at your own risk.
"},{"location":"nebula-cloud/7.terms-and-conditions/#prohibited_uses","title":"Prohibited uses","text":"In addition to other terms as set forth in the Agreement, you are prohibited from using the Website and Services or Content: (a) for any unlawful purpose; (b) to solicit others to perform or participate in any unlawful acts; (c) to violate any international, federal, provincial or state regulations, rules, laws, or local ordinances; (d) to infringe upon or violate our intellectual property rights or the intellectual property rights of others; (e) to harass, abuse, insult, harm, defame, slander, disparage, intimidate, or discriminate based on gender, sexual orientation, religion, ethnicity, race, age, national origin, or disability; (f) to submit false or misleading information; (g) to upload or transmit viruses or any other type of malicious code that will or may be used in any way that will affect the functionality or operation of the Website and Services, third party products and services, or the Internet; (h) to spam, phish, pharm, pretext, spider, crawl, or scrape; (i) for any obscene or immoral purpose; or (j) to interfere with or circumvent the security features of the Website and Services, third party products and services, or the Internet. We reserve the right to terminate your use of the Website and Services for violating any of the prohibited uses.
"},{"location":"nebula-cloud/7.terms-and-conditions/#intellectual_property_rights","title":"Intellectual property rights","text":"\"Intellectual Property Rights\" means all present and future rights conferred by law or statute in or in relation to any copyright and related rights, trademarks, designs, patents, inventions, goodwill and the right to sue for passing off, rights to inventions, rights to use, and all other intellectual property rights, in each case whether registered or unregistered and including all applications and rights to apply for and be granted, rights to claim priority from, such rights and all similar or equivalent rights or forms of protection and any other results of intellectual activity which subsist or will subsist now or in the future in any part of the world. This Agreement does not transfer to you any intellectual property owned by vesoft inc. or third parties, and all rights, titles, and interests in and to such property will remain (as between the parties) solely with vesoft inc. All trademarks, service marks, graphics and logos used in connection with the Website and Services, are trademarks or registered trademarks of vesoft inc. or its licensors. Other trademarks, service marks, graphics and logos used in connection with the Website and Services may be the trademarks of other third parties. Your use of the Website and Services grants you no right or license to reproduce or otherwise use any of vesoft inc. or third party trademarks.
"},{"location":"nebula-cloud/7.terms-and-conditions/#disclaimer_of_warranty","title":"Disclaimer of warranty","text":"You agree that such Service is provided on an \"as is\" and \"as available\" basis and that your use of the Website and Services is solely at your own risk. We expressly disclaim all warranties of any kind, whether express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose and non-infringement. We make no warranty that the Services will meet your requirements, or that the Service will be uninterrupted, timely, secure, or error-free; nor do we make any warranty as to the results that may be obtained from the use of the Service or as to the accuracy or reliability of any information obtained through the Service or that defects in the Service will be corrected. You understand and agree that any material and/or data downloaded or otherwise obtained through the use of Service is done at your own discretion and risk and that you will be solely responsible for any damage or loss of data that results from the download of such material and/or data. We make no warranty regarding any goods or services purchased or obtained through the Service or any transactions entered into through the Service. No advice or information, whether oral or written, obtained by you from us or through the Service shall create any warranty not expressly made herein.
"},{"location":"nebula-cloud/7.terms-and-conditions/#limitation_of_liability","title":"Limitation of liability","text":"To the fullest extent permitted by applicable law, in no event will vesoft inc., its affiliates, directors, officers, employees, agents, suppliers or licensors be liable to any person for any indirect, incidental, special, punitive, cover or consequential damages (including, without limitation, damages for lost profits, revenue, sales, goodwill, use of content, impact on business, business interruption, loss of anticipated savings, loss of business opportunity) however caused, under any theory of liability, including, without limitation, contract, tort, warranty, breach of statutory duty, negligence or otherwise, even if the liable party has been advised as to the possibility of such damages or could have foreseen such damages.
"},{"location":"nebula-cloud/7.terms-and-conditions/#indemnification","title":"Indemnification","text":"You agree to indemnify and hold vesoft inc. and its affiliates, directors, officers, employees, agents, suppliers and licensors harmless from and against any liabilities, losses, damages or costs, including reasonable attorneys' fees, incurred in connection with or arising from any third party allegations, claims, actions, disputes, or demands asserted against any of them as a result of or relating to your Content, your use of the Website and Services or any willful misconduct on your part.
"},{"location":"nebula-cloud/7.terms-and-conditions/#severability","title":"Severability","text":"All rights and restrictions contained in this Agreement may be exercised and shall be applicable and binding only to the extent that they do not violate any applicable laws and are intended to be limited to the extent necessary so that they will not render this Agreement illegal, invalid or unenforceable. If any provision or portion of any provision of this Agreement shall be held to be illegal, invalid or unenforceable by a court of competent jurisdiction, it is the intention of the parties that the remaining provisions or portions thereof shall constitute their agreement with respect to the subject matter hereof, and all such remaining provisions or portions thereof shall remain in full force and effect.
"},{"location":"nebula-cloud/7.terms-and-conditions/#dispute_resolution","title":"Dispute resolution","text":"The formation, interpretation, and performance of this Agreement and any disputes arising out of it shall be governed by the substantive and procedural laws of China without regard to its rules on conflicts or choice of law and, to the extent applicable, the laws of China. You further consent to the territorial jurisdiction of and exclusive venue in Internet Court of Hangzhou as the legal forum for any such dispute. You hereby waive any right to a jury trial in any proceeding arising out of or related to this Agreement. The United Nations Convention on Contracts for the International Sale of Goods does not apply to this Agreement.
"},{"location":"nebula-cloud/7.terms-and-conditions/#assignment","title":"Assignment","text":"You may not assign, resell, sub-license or otherwise transfer or delegate any of your rights or obligations hereunder, in whole or in part, without our prior written consent, which consent shall be at our own sole discretion and without obligation; any such assignment or transfer shall be null and void. We are free to assign any of its rights or obligations hereunder, in whole or in part, to any third party as part of the sale of all or substantially all of its assets or stock or as part of a merger.
"},{"location":"nebula-cloud/7.terms-and-conditions/#changes_and_amendments","title":"Changes and amendments","text":"We reserve the right to modify this Agreement or its terms relating to the Website and Services at any time, effective upon posting of an updated version of this Agreement on the Website. When we do, we will revise the updated date at the bottom of this page. Continued use of the Website and Services after any such changes shall constitute your consent to such changes.
"},{"location":"nebula-cloud/7.terms-and-conditions/#acceptance_of_these_terms","title":"Acceptance of these terms","text":"You acknowledge that you have read this Agreement and agree to all its terms and conditions. By accessing and using the Website and Services you agree to be bound by this Agreement. If you do not agree to abide by the terms of this Agreement, you are not authorized to access or use the Website and Services.
"},{"location":"nebula-cloud/7.terms-and-conditions/#contacting_us","title":"Contacting us","text":"If you would like to contact us to understand more about this Agreement or wish to contact us concerning any matter relating to it, you may send an email to legal@vesoft.com
This document was last updated on December 14, 2021
"},{"location":"nebula-cloud/8.privacy-policy/","title":"Privacy Policy","text":"This privacy policy (\"Policy\") describes how the personally identifiable information (\"Personal Information\") you may provide on the https://www.nebula-cloud.io[TODO] website (\"Website\" or \"Service\") and any of its related products and services (collectively, \"Services\") is collected, protected and used. It also describes the choices available to you regarding our use of your Personal Information and how you can access and update this information. This Policy is a legally binding agreement between you (\"User\", \"you\" or \"your\") and vesoft Inc. (\"vesoft Inc.\", \"we\", \"us\" or \"our\"). By accessing and using the Website and Services, you acknowledge that you have read, understood, and agree to be bound by the terms of this Agreement. This Policy does not apply to the practices of companies that we do not own or control, or to individuals that we do not employ or manage.
"},{"location":"nebula-cloud/8.privacy-policy/#automatic_collection_of_information","title":"Automatic collection of information","text":"When you open the Website, our servers automatically record information that your browser sends. This data may include information such as your device's IP address, browser type and version, operating system type and version, language preferences or the webpage you were visiting before you came to the Website and Services, pages of the Website and Services that you visit, the time spent on those pages, information you search for on the Website, access times and dates, and other statistics.
Information collected automatically is used only to identify potential cases of abuse and establish statistical information regarding the usage and traffic of the Website and Services. This statistical information is not otherwise aggregated in such a way that would identify any particular user of the system.
"},{"location":"nebula-cloud/8.privacy-policy/#collection_of_personal_information","title":"Collection of personal information","text":"You can access and use the Website and Services without telling us who you are or revealing any information by which someone could identify you as a specific, identifiable individual. If, however, you wish to use some of the features on the Website, you may be asked to provide certain Personal Information (for example, your name and e-mail address). We receive and store any information you knowingly provide to us when you create an account or fill any online forms on the Website. When required, this information may include the following:
You can choose not to provide us with your Personal Information, but then you may not be able to take advantage of some of the features on the Website. Users who are uncertain about what information is mandatory are welcome to contact us.
"},{"location":"nebula-cloud/8.privacy-policy/#use_and_processing_of_collected_information","title":"Use and processing of collected information","text":"In order to make the Website and Services available to you, or to meet a legal obligation, we need to collect and use certain Personal Information. If you do not provide the information that we request, we may not be able to provide you with the requested products or services. Some of the information we collect is directly from you via the Website and Services. However, we may also collect Personal Information about you from other sources. Any of the information we collect from you may be used for the following purposes:
Processing your Personal Information depends on how you interact with the Website and Services, where you are located in the world and if one of the following applies: (i) you have given your consent for one or more specific purposes; (ii) provision of information is necessary for the performance of an agreement with you and/or for any pre-contractual obligations thereof; (iii) processing is necessary for compliance with a legal obligation to which you are subject; (iv) processing is related to a task that is carried out in the public interest or in the exercise of official authority vested in us; (v) processing is necessary for the purposes of the legitimate interests pursued by us or by a third party.
Note that under some legislation we may be allowed to process information until you object to such processing (by opting out), without having to rely on consent or any other of the following legal bases below. In any case, we will be happy to clarify the specific legal basis that applies to the processing, and in particular whether the provision of Personal Information is a statutory or contractual requirement, or a requirement necessary to enter into a contract.
"},{"location":"nebula-cloud/8.privacy-policy/#managing_information","title":"Managing information","text":"You are able to delete certain Personal Information we have about you. The Personal Information you can delete may change as the Website and Services change. If you would like to delete your Personal Information or permanently delete your account, you can do so by contacting us.
"},{"location":"nebula-cloud/8.privacy-policy/#disclosure_of_information","title":"Disclosure of information","text":"Depending on the requested Services or as necessary to complete any transaction or provide any service you have requested, we may share your information with your consent with our trusted third parties that work with us, any other affiliates and subsidiaries we rely upon to assist in the operation of the Website and Services available to you. We do not share Personal Information with unaffiliated third parties. These service providers are not authorized to use or disclose your information except as necessary to perform services on our behalf or comply with legal requirements. We may share your Personal Information for these purposes only with third parties whose privacy policies are consistent with ours or who agree to abide by our policies with respect to Personal Information. These third parties are given Personal Information they need only in order to perform their designated functions, and we do not authorize them to use or disclose Personal Information for their own marketing or other purposes.
We will disclose any Personal Information we collect, use or receive if required or permitted by law, such as to comply with a subpoena, or similar legal process, and when we believe in good faith that disclosure is necessary to protect our rights, protect your safety or the safety of others, investigate fraud, or respond to a government request.
In the event we go through a business transition, such as a merger or acquisition by another company, or sale of all or a portion of its assets, your user account, and Personal Information will likely be among the assets transferred.
"},{"location":"nebula-cloud/8.privacy-policy/#retention_of_information","title":"Retention of information","text":"We will retain and use your Personal Information for the period necessary to comply with our legal obligations, resolve disputes, and enforce our agreements unless a longer retention period is required or permitted by law. We may use any aggregated data derived from or incorporating your Personal Information after you update or delete it, but not in a manner that would identify you personally. Once the retention period expires, Personal Information shall be deleted. Therefore, the right to access, the right to erasure, the right to rectification and the right to data portability cannot be enforced after the expiration of the retention period.
"},{"location":"nebula-cloud/8.privacy-policy/#transfer_of_information","title":"Transfer of information","text":"Depending on your location, data transfers may involve transferring and storing your information in a country other than your own. You are entitled to learn about the legal basis of information transfers to a country outside the European Union or to any international organization governed by public international law or set up by two or more countries, such as the UN, and about the security measures taken by us to safeguard your information. If any such transfer takes place, you can find out more by checking the relevant sections of this Policy or inquire with us using the information provided in the contact section.
"},{"location":"nebula-cloud/8.privacy-policy/#the_rights_of_users","title":"The rights of users","text":"You may exercise certain rights regarding your information processed by us. In particular, you have the right to do the following: (i) you have the right to withdraw consent where you have previously given your consent to the processing of your information; (ii) you have the right to object to the processing of your information if the processing is carried out on a legal basis other than consent; (iii) you have the right to learn if information is being processed by us, obtain disclosure regarding certain aspects of the processing and obtain a copy of the information undergoing processing; (iv) you have the right to verify the accuracy of your information and ask for it to be updated or corrected; (v) you have the right, under certain circumstances, to restrict the processing of your information, in which case, we will not process your information for any purpose other than storing it; (vi) you have the right, under certain circumstances, to obtain the erasure of your Personal Information from us; (vii) you have the right to receive your information in a structured, commonly used and machine readable format and, if technically feasible, to have it transmitted to another controller without any hindrance. This provision is applicable provided that your information is processed by automated means and that the processing is based on your consent, on a contract which you are part of or on pre-contractual obligations thereof.
"},{"location":"nebula-cloud/8.privacy-policy/#the_right_to_object_to_processing","title":"The right to object to processing","text":"Where Personal Information is processed for the public interest, in the exercise of an official authority vested in us or for the purposes of the legitimate interests pursued by us, you may object to such processing by providing a ground related to your particular situation to justify the objection.
"},{"location":"nebula-cloud/8.privacy-policy/#how_to_exercise_these_rights","title":"How to exercise these rights","text":"Any requests to exercise your rights can be directed to vesoft Inc. through the contact details provided in this document. Please note that we may ask you to verify your identity before responding to such requests. Your request must provide sufficient information that allows us to verify that you are the person you are claiming to be or that you are the authorized representative of such person. You must include sufficient details to allow us to properly understand the request and respond to it. We cannot respond to your request or provide you with Personal Information unless we first verify your identity or authority to make such a request and confirm that the Personal Information relates to you.
"},{"location":"nebula-cloud/8.privacy-policy/#privacy_of_children","title":"Privacy of children","text":"We do not knowingly collect any Personal Information from children under the age of 18. If you are under the age of 18, please do not submit any Personal Information through the Website and Services. We encourage parents and legal guardians to monitor their children's Internet usage and to help enforce this Policy by instructing their children never to provide Personal Information through the Website and Services without their permission. If you have reason to believe that a child under the age of 18 has provided Personal Information to us through the Website and Services, please contact us. You must also be at least 16 years of age to consent to the processing of your Personal Information in your country (in some countries we may allow your parent or guardian to do so on your behalf).
"},{"location":"nebula-cloud/8.privacy-policy/#cookies","title":"Cookies","text":"The Website and Services use \"cookies\" to help personalize your online experience. A cookie is a text file that is placed on your hard disk by a web page server. Cookies cannot be used to run programs or deliver viruses to your computer. Cookies are uniquely assigned to you, and can only be read by a web server in the domain that issued the cookie to you. We may use cookies to collect, store, and track information for statistical purposes to operate the Website and Services. You have the ability to accept or decline cookies. Most web browsers automatically accept cookies, but you can usually modify your browser setting to decline cookies if you prefer. To learn more about cookies and how to manage them, visit internetcookies.org
"},{"location":"nebula-cloud/8.privacy-policy/#email_marketing","title":"Email marketing","text":"We offer electronic newsletters to which you may voluntarily subscribe at any time. We are committed to keeping your e-mail address confidential and will not disclose your email address to any third parties except as allowed in the information use and processing section. We will maintain the information sent via e-mail in accordance with applicable laws and regulations.
"},{"location":"nebula-cloud/8.privacy-policy/#links_to_other_resources","title":"Links to other resources","text":"The Website and Services contain links to other resources that are not owned or controlled by us. Such links do not constitute an endorsement by vesoft Inc. of those External Web Sites. Please be aware that we are not responsible for the privacy practices of such other resources or third parties. We encourage you to be aware when you leave the Website and Services and to read the privacy statements of each and every resource that may collect Personal Information. You should carefully review the legal statements and other conditions of use of any resource which you access through a link on the Website and Services.
"},{"location":"nebula-cloud/8.privacy-policy/#information_security","title":"Information security","text":"We secure information you provide on computer servers in a controlled, secure environment, protected from unauthorized access, use, or disclosure. We maintain reasonable administrative, technical, and physical safeguards in an effort to protect against unauthorized access, use, modification, and disclosure of Personal Information in its control and custody. However, no data transmission over the Internet or wireless network can be guaranteed. Therefore, while we strive to protect your Personal Information, you acknowledge that (i) there are security and privacy limitations of the Internet which are beyond our control; (ii) the security, integrity, and privacy of any and all information and data exchanged between you and the Website and Services cannot be guaranteed; and (iii) any such information and data may be viewed or tampered with in transit by a third party, despite best efforts.
"},{"location":"nebula-cloud/8.privacy-policy/#data_breach","title":"Data breach","text":"In the event we become aware that the security of the Website and Services has been compromised or users Personal Information has been disclosed to unrelated third parties as a result of external activity, including, but not limited to, security attacks or fraud, we reserve the right to take reasonably appropriate measures, including, but not limited to, investigation and reporting, as well as notification to and cooperation with law enforcement authorities. In the event of a data breach, we will make reasonable efforts to notify affected individuals if we believe that there is a reasonable risk of harm to the user as a result of the breach or if notice is otherwise required by law. When we do, we will post a notice on the Website, send you an email.
"},{"location":"nebula-cloud/8.privacy-policy/#changes_and_amendments","title":"Changes and amendments","text":"We reserve the right to modify this Policy or its terms relating to the Website and Services from time to time in our discretion and will notify you of any material changes to the way in which we treat Personal Information. When we do, we will revise the updated date at the bottom of this page. We may also provide notice to you in other ways in our discretion, such as through contact information you have provided. Any updated version of this Policy will be effective immediately upon the posting of the revised Policy unless otherwise specified. Your continued use of the Website and Services after the effective date of the revised Policy (or such other act specified at that time) will constitute your consent to those changes. However, we will not, without your consent, use your Personal Information in a manner materially different than what was stated at the time your Personal Information was collected.
"},{"location":"nebula-cloud/8.privacy-policy/#dispute_resolution","title":"Dispute resolution","text":"The formation, interpretation, and performance of this Agreement and any disputes arising out of it shall be governed by the substantive and procedural laws of China without regard to its rules on conflicts or choice of law and, to the extent applicable, the laws of China. You further consent to the personal jurisdiction of and exclusive venue in Yuhang District Court located in Hangzhou as the legal forum for any such dispute. You hereby waive any right to a jury trial in any proceeding arising out of or related to this Agreement. The United Nations Convention on Contracts for the International Sale of Goods does not apply to this Agreement.
"},{"location":"nebula-cloud/8.privacy-policy/#acceptance_of_this_policy","title":"Acceptance of this policy","text":"You acknowledge that you have read this Policy and agree to all its terms and conditions. By accessing and using the Website and Services you agree to be bound by this Policy. If you do not agree to abide by the terms of this Policy, you are not authorized to access or use the Website and Services.
"},{"location":"nebula-cloud/8.privacy-policy/#contacting_us","title":"Contacting us","text":"If you would like to contact us to understand more about this Policy or wish to contact us concerning any matter relating to individual rights and your Personal Information, you may send an email to legal@vesoft.com
This document was last updated on December 28, 2021
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/","title":"Solution","text":"On the Solution page, the sidebars are different based on roles and privileges. For more information, see Roles and privileges in Cloud.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#solution_info","title":"Solution Info","text":"On the homepage of Cloud, click on the Solution's name to enter the Solution Info page. The Solution Info page consists of the following parts: Basic Info, Instance Info, Price Info, Getting Started. You can view the information on this page in detail.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#applications","title":"Applications","text":"In the sidebar, click Applications to enter the page of ecosystem tools(Dashboard/Studio/Explorer). Different roles see different ecosystem tools. For more information, see Accessory applications.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#connectivity","title":"Connectivity","text":"In the sidebar, click Connectivity to enter Private Link page. On this page, you can create a Private Link endpoint that enables you to access NebulaGraph databases through a private IP address in a virtual network. For more information, see Private Link.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#root_management","title":"Root Management","text":"In the sidebar, click Root Management to enter the root account management page. For more information, see Role and User Management.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#user_management","title":"User Management","text":"In the sidebar, click User Management to enter the user account management page. For more information, see Role and User Management.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#audit_log","title":"Audit Log","text":"In the sidebar, click Audit Log to enter the operation history page. You can select the time period according to the operation information such as Create Solution
, Start Solution
, Stop Solution
, and filter results by operator and operation record.
In the sidebar, click Settings to enter the settings page, where you can Stop Solution
or Transfer Solution
.
NebulaGraph Cloud integrates with NebulaGraph Studio, NebulaGraph Dashboard, and NebulaGraph Explorer.
On the Applications page, the available ecosystem tools differ based on roles and privileges. The correspondence between roles and privileges is as follows. The first column lists the tools, Y means the role has access to the tool, and N means it does not.
Tools OWNER ROOT USER Dashboard Y Y N Studio N Y Y Explorer N Y Y"},{"location":"nebula-cloud/5.solution/5.1.supporting-application/#dashboard","title":"Dashboard","text":"NebulaGraph Dashboard (Dashboard for short) is a visualization tool that monitors and manages the status of machines and services in NebulaGraph clusters.
"},{"location":"nebula-cloud/5.solution/5.1.supporting-application/#studio","title":"Studio","text":"NebulaGraph Studio (Studio in short) is a browser-based visualization tool to manage NebulaGraph. It provides you with a graphical user interface to manipulate graph schemas, import data, explore graph data, and run nGQL statements to retrieve data. With Studio, you can quickly become a graph exploration expert from scratch. For more information, see What is NebulaGraph Studio.
"},{"location":"nebula-cloud/5.solution/5.1.supporting-application/#explorer","title":"Explorer","text":"NebulaGraph Explorer (Explorer in short) is a browser-based visualization tool. It is used with the NebulaGraph core to visualize interaction with graph data. Even without any experience in a graph database, you can quickly become a graph exploration expert.
"},{"location":"nebula-cloud/5.solution/5.2.connection-configuration-and-use/","title":"Private Link","text":"You can create a Private Link endpoint in Connectivity to allow users to access NebulaGraph databases through a private IP in a virtual network, without exposing your traffic to the public internet. For more information about Private Link, see What is Azure Private Link?.
"},{"location":"nebula-cloud/5.solution/5.2.connection-configuration-and-use/#configure_private_link","title":"Configure Private Link","text":"Enter your subscription ID, click Create. The creation time takes about 2 minutes.
Note
The subscription ID is on the Subscriptions page of the Azure Portal. You can open the Subscriptions page (https://portal.azure.com/?l=en.en-us#blade/Microsoft_Azure_Billing/SubscriptionsBlade) for quick access.
After the creation, you can use Alias to connect to Azure resources and create a private endpoint in Azure.
Click + add.
In the Basics section, fill in the following plan details:
Project details
Field Description Subscription Select the subscription. Resource group Select an existing resource group or create a new resource group. Instance details
Field Description Name Set the name of the private endpoint. Region Select the region. Caution
The region of the database you select should be in the same area as that of your business to avoid performance and speed problems.
At the bottom of the Basics page, click Next: Resource.
In the Resource section, fill in the following plan details:
Field Description Connection method Click Connect to an Azure resource by resource ID or alias. Resource ID or alias Set the alias. Request message Set the message; it will be sent to the resource owner. Note
The alias is on the Connectivity page of NebulaGraph Cloud, click to copy it.
At the bottom of the Resource page, click Next: Configuration.
In the Configuration section, select the following plan details:
Networking
Field Description Virtual network Set virtual networks. Subnet Set the subnet in the selected virtual network. Note
Private DNS integration is currently not supported.
At the bottom of the Configuration page, click Next: Tags.
(Optional) In the Tags section, enter Name:Values.
At the bottom of the Tags page, click Next: Review + create.
After creating the private endpoint, copy the Private IP address in Network interface to the Connectivity page in Cloud. Click Create.
Note
Private Link endpoint IP information is stored in the Cloud, and you can click to modify it.
You can use the Private Link endpoint IP to connect to NebulaGraph. For more information, see Connect to NebulaGraph.
"},{"location":"nebula-cloud/5.solution/5.3.role-and-authority-management/","title":"Roles and authority management","text":"NebulaGraph Cloud roles are different from roles in NebulaGraph. For more information, see Roles in Cloud Solution.
Roles in Cloud Roles in NebulaGraph OWNER - ROOT ROOT USER ADMIN/DBA/GUEST/USER"},{"location":"nebula-cloud/5.solution/5.3.role-and-authority-management/#root_management","title":"Root Management","text":"Only users with OWNER authority can manage ROOT users.
On the Root Management page, OWNER can reset ROOT users.
Click Reset, enter the email address of the ROOT user to be updated, and click Send Email to send the email.
After the ROOT user receives the confirmation email, click Confirm.
Only users with ROOT authority can manage USER users.
On the User Management page, the ROOT user can grant roles in graph spaces to other users. Available roles are ADMIN, DBA, GUEST, and USER.
NebulaGraph Dashboard Community Edition (Dashboard for short) is a visualization tool that monitors the status of machines and services in NebulaGraph clusters.
Enterpriseonly
Dashboard Enterprise Edition adds features such as visual cluster creation, batch import of clusters, fast scaling, etc. For more information, see Pricing.
"},{"location":"nebula-dashboard/1.what-is-dashboard/#features","title":"Features","text":"Dashboard monitors:
You can use Dashboard in one of the following scenarios:
The monitoring data will be retained for 14 days by default, that is, only the monitoring data within the last 14 days can be queried.
Note
The monitoring service is supported by Prometheus. The update frequency and retention intervals can be modified. For details, see Prometheus.
"},{"location":"nebula-dashboard/1.what-is-dashboard/#version_compatibility","title":"Version compatibility","text":"The version correspondence between NebulaGraph and Dashboard Community Edition is as follows.
NebulaGraph version Dashboard version 3.6.0 3.4.0 3.5.x 3.4.0 3.4.0 ~ 3.4.1 3.4.0, 3.2.0 3.3.0 3.2.0 2.5.0 ~ 3.2.0 3.1.0 2.5.x ~ 3.1.0 1.1.1 2.0.1~2.5.1 1.0.2 2.0.1~2.5.1 1.0.1"},{"location":"nebula-dashboard/1.what-is-dashboard/#release_note","title":"Release note","text":"Release
"},{"location":"nebula-dashboard/2.deploy-dashboard/","title":"Deploy Dashboard Community Edition","text":"This topic will describe how to deploy NebulaGraph Dashboard in detail.
To download and compile the latest source code of Dashboard, follow the instructions on the nebula-dashboard GitHub page.
"},{"location":"nebula-dashboard/2.deploy-dashboard/#prerequisites","title":"Prerequisites","text":"Before you deploy Dashboard, you must confirm that:
Before the installation starts, the following ports are not occupied.
Download the tar package nebula-dashboard-3.4.0.x86_64.tar.gz as needed.
Run tar -xvf nebula-dashboard-3.4.0.x86_64.tar.gz
to decompress the installation package.
Modify the config.yaml
file in nebula-dashboard
.
The configuration file contains the configurations of four dependent services and configurations of clusters. The descriptions of the dependent services are as follows.
Service Default port Description nebula-http-gateway 8090 Provides HTTP ports for cluster services to execute nGQL statements to interact with the NebulaGraph database. nebula-stats-exporter 9200 Collects the performance metrics in the cluster, including the IP addresses, versions, and monitoring metrics (such as the number of queries, the latency of queries, the latency of heartbeats, and so on). node-exporter 9100 Collects the source information of nodes in the cluster, including the CPU, memory, load, disk, and network. prometheus 9090 The time series database that stores monitoring data. The descriptions of the configuration file are as follows.
port: 7003 # Web service port.\ngateway:\n ip: hostIP # The IP of the machine where the Dashboard is deployed.\n port: 8090\n https: false # Whether to enable HTTPS.\n runmode: dev # Program running mode, including dev, test, and prod. It is used to distinguish between different running environments generally.\nstats-exporter:\n ip: hostIP # The IP of the machine where the Dashboard is deployed.\n nebulaPort: 9200\n https: false # Whether to enable HTTPS.\nnode-exporter:\n - ip: nebulaHostIP_1 # The IP of the machine where the NebulaGraph is deployed.\n port: 9100\n https: false # Whether to enable HTTPS.\n# - ip: nebulaHostIP_2\n# port: 9100\n# https: false\nprometheus:\n ip: hostIP # The IP of the machine where the Dashboard is deployed.\n prometheusPort: 9090\n https: false # Whether to enable HTTPS.\n scrape_interval: 5s # The interval for collecting the monitoring data, which is 1 minute by default.\n evaluation_interval: 5s # The interval for running alert rules, which is 1 minute by default.\n# Cluster node info\nnebula-cluster:\n name: 'default' # Cluster name\n metad:\n - name: metad0\n endpointIP: nebulaMetadIP # The IP of the machine where the Meta service is deployed.\n port: 9559\n endpointPort: 19559\n # - name: metad1\n # endpointIP: nebulaMetadIP\n # port: 9559\n # endpointPort: 19559 \n graphd:\n - name: graphd0\n endpointIP: nebulaGraphdIP # The IP of the machine where the Graph service is deployed.\n port: 9669\n endpointPort: 19669\n # - name: graphd1\n # endpointIP: nebulaGraphdIP\n # port: 9669\n # endpointPort: 19669 \n storaged:\n - name: storaged0\n endpointIP: nebulaStoragedIP # The IP of the machine where the Storage service is deployed.\n port: 9779\n endpointPort: 19779\n # - name: storaged1\n # endpointIP: nebulaStoragedIP\n # port: 9779\n # endpointPort: 19779 \n
Run ./dashboard.service start all
to start the services.
If you are deploying Dashboard using docker, you should also modify the configuration file config.yaml
, and then run docker-compose up -d
to start the container.
Note
If you change the port number in config.yaml
, the port number in docker-compose.yaml
needs to be consistent as well.
Run docker-compose stop
to stop the container.
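Before logging in, you can optionally check from the host that the key ports answer. This is a quick sanity check, assuming the default ports from config.yaml and a local deployment:
$ curl -sI http://127.0.0.1:7003   # Dashboard web service\n$ curl -s http://127.0.0.1:9090/-/healthy   # Prometheus health endpoint\n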
You can use the dashboard.service
script to start, restart, stop, and check the Dashboard services.
sudo <dashboard_path>/dashboard.service\n[-v] [-h]\n<start|restart|stop|status> <prometheus|webserver|exporter|gateway|all>\n
Parameter Description dashboard_path
Dashboard installation path. -v
Display detailed debugging information. -h
Display help information. start
Start the target services. restart
Restart the target services. stop
Stop the target services. status
Check the status of the target services. prometheus
Set the prometheus service as the target service. webserver
Set the webserver Service as the target service. exporter
Set the exporter Service as the target service. gateway
Set the gateway Service as the target service. all
Set all the Dashboard services as the target services. Note
To view the Dashboard version, run the command ./dashboard.service -version
.
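For example, assuming Dashboard was decompressed to /usr/local/nebula-dashboard (replace the path with your own installation path), you can check all services or restart a single one:
$ sudo /usr/local/nebula-dashboard/dashboard.service status all\n$ sudo /usr/local/nebula-dashboard/dashboard.service restart prometheus\n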
Connect to Dashboard
"},{"location":"nebula-dashboard/3.connect-dashboard/","title":"Connect Dashboard","text":"After Dashboard is deployed, you can log in and use Dashboard on the browser.
"},{"location":"nebula-dashboard/3.connect-dashboard/#prerequisites","title":"Prerequisites","text":"Confirm the IP address of the machine where the Dashboard service is installed. Enter <IP>:7003
in the browser to open the login page.
Enter the username and the password of the NebulaGraph database.
If authentication is not enabled, you can use root
as the username and random characters as the password.
Select the NebulaGraph version to be used.
Click Login.
NebulaGraph Dashboard consists of three parts: Machine, Service, and Management. This topic will describe them in detail.
"},{"location":"nebula-dashboard/4.use-dashboard/#overview","title":"Overview","text":""},{"location":"nebula-dashboard/4.use-dashboard/#machine","title":"Machine","text":"Click Machine->Overview to enter the machine overview page.
On this page, you can view the variation of CPU, Memory, Load, Disk, and Network In/Out quickly.
To view the detailed monitoring information, click the button. In this example, select Load
for details. The figure is as follows.
Click Service->Overview to enter the service overview page.
On this page, you can view the information of Graph, Meta, and Storage services quickly. In the upper right corner, the number of normal services and abnormal services will be displayed.
Note
In the Service page, only two monitoring metrics can be set for each service, which can be adjusted by clicking the Set up button.
To view the detailed monitoring information, click the button. In this example, select Graph
for details. The figure is as follows.
Note
Before using graph space metrics, you need to set enable_space_level_metrics
to true
in the Graph service. For details, see [Graph Service configuration](../5.configurations-and-logs/1.configurations/3.graph-config.md.
Space-level metric incompatibility
If a graph space name contains special characters, the corresponding metric data of that graph space may not be displayed.
The service monitoring page can also monitor graph space level metrics. Only when the behavior of a graph space metric is triggered, you can specify the graph space to view information about the corresponding graph space metric.
Space graph metrics record the information of different graph spaces separately. Currently, only the Graph service supports a set of space-level metrics.
For information about the space graph metrics, see Graph space.
"},{"location":"nebula-dashboard/4.use-dashboard/#management","title":"Management","text":""},{"location":"nebula-dashboard/4.use-dashboard/#overview_info","title":"Overview info","text":"On the Overview Info page, you can see the information of the NebulaGraph cluster, including Storage leader distribution, Storage service details, versions and hosts information of each NebulaGraph service, and partition distribution and details.
"},{"location":"nebula-dashboard/4.use-dashboard/#storage_leader_distribution","title":"Storage Leader Distribution","text":"In this section, the number of Leaders and the Leader distribution will be shown.
In this section, the version and host information of each NebulaGraph service will be shown. Click Detail in the upper right corner to view the details of the version and host information.
"},{"location":"nebula-dashboard/4.use-dashboard/#service_information","title":"Service information","text":"In this section, the information on Storage services will be shown. The parameter description is as follows:
Parameter DescriptionHost
The IP address of the host. Port
The port of the host. Status
The host status. Git Info Sha
The commit ID of the current version. Leader Count
The number of Leaders. Partition Distribution
The distribution of partitions. Leader Distribution
The distribution of Leaders. Click Detail in the upper right corner to view the details of the Storage service information.
"},{"location":"nebula-dashboard/4.use-dashboard/#partition_distribution","title":"Partition Distribution","text":"Select the specified graph space in the upper left corner, you can view the distribution of partitions in the specified graph space. You can see the IP addresses and ports of all Storage services in the cluster, and the number of partitions in each Storage service.
Click Detail in the upper right corner to view more details.
"},{"location":"nebula-dashboard/4.use-dashboard/#partition_information","title":"Partition information","text":"In this section, the information on partitions will be shown. Before viewing the partition information, you need to select a graph space in the upper left corner. The parameter description is as follows:
Parameter DescriptionPartition ID
The ID of the partition. Leader
The IP address and port of the leader. Peers
The IP addresses and ports of all the replicas. Losts
The IP addresses and ports of faulty replicas. Click Detail in the upper right corner to view details. You can also enter the partition ID into the input box in the upper right corner of the details page to filter the shown data.
"},{"location":"nebula-dashboard/4.use-dashboard/#config","title":"Config","text":"It shows the configuration of the NebulaGraph service. NebulaGraph Dashboard Community Edition does not support online modification of configurations for now.
"},{"location":"nebula-dashboard/4.use-dashboard/#others","title":"Others","text":"In the lower left corner of the page, you can:
This topic will describe the monitoring metrics in NebulaGraph Dashboard.
"},{"location":"nebula-dashboard/6.monitor-parameter/#machine","title":"Machine","text":"Note
cpu_utilization
The percentage of used CPU. cpu_idle
The percentage of idled CPU. cpu_wait
The percentage of CPU waiting for IO operations. cpu_user
The percentage of CPU used by users. cpu_system
The percentage of CPU used by the system."},{"location":"nebula-dashboard/6.monitor-parameter/#memory","title":"Memory","text":"Parameter Description memory_utilization
The percentage of used memory. memory_used
The memory space used (not including caches). memory_free
The memory space available."},{"location":"nebula-dashboard/6.monitor-parameter/#load","title":"Load","text":"Parameter Description load_1m
The average load of the system in the last 1 minute. load_5m
The average load of the system in the last 5 minutes. load_15m
The average load of the system in the last 15 minutes."},{"location":"nebula-dashboard/6.monitor-parameter/#disk","title":"Disk","text":"Parameter Description disk_used_percentage
The disk utilization percentage. disk_used
The disk space used. disk_free
The disk space available. disk_readbytes
The number of bytes that the system reads in the disk per second. disk_writebytes
The number of bytes that the system writes in the disk per second. disk_readiops
The number of read queries that the disk receives per second. disk_writeiops
The number of write queries that the disk receives per second. inode_utilization
The percentage of used inode."},{"location":"nebula-dashboard/6.monitor-parameter/#network","title":"Network","text":"Parameter Description network_in_rate
The number of bytes that the network card receives per second. network_out_rate
The number of bytes that the network card sends out per second. network_in_errs
The number of wrong bytes that the network card receives per second. network_out_errs
The number of wrong bytes that the network card sends out per second. network_in_packets
The number of data packages that the network card receives per second. network_out_packets
The number of data packages that the network card sends out per second."},{"location":"nebula-dashboard/6.monitor-parameter/#service","title":"Service","text":""},{"location":"nebula-dashboard/6.monitor-parameter/#period","title":"Period","text":"The period is the time range of counting metrics. It currently supports 5 seconds, 60 seconds, 600 seconds, and 3600 seconds, which respectively represent the last 5 seconds, the last 1 minute, the last 10 minutes, and the last 1 hour.
"},{"location":"nebula-dashboard/6.monitor-parameter/#metric_methods","title":"Metric methods","text":"Parameter Descriptionrate
The average rate of operations per second in a period. sum
The sum of operations in the period. avg
The average latency in the cycle. P75
The 75th percentile latency. P95
The 95th percentile latency. P99
The 99th percentile latency. P999
The 99.9th percentile latency. Note
Dashboard collects the following metrics from the NebulaGraph core, but only shows the most important ones.
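These metrics can also be queried from the services over HTTP in the metric_name.method.period form described above. A sketch, where the IP address is a placeholder and 19669 is the default HTTP port of the Graph service:
$ curl -G "http://192.168.8.100:19669/stats?stats=num_queries.rate.60"\n$ curl -G "http://192.168.8.100:19669/stats?stats=num_queries.sum.60,query_latency_us.avg.60"\n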
"},{"location":"nebula-dashboard/6.monitor-parameter/#graph","title":"Graph","text":"Parameter Descriptionnum_active_queries
The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. num_active_sessions
The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time. For example, when querying num_active_sessions.sum.5
, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is -20
(10-30). num_aggregate_executors
The number of executions for the Aggregation operator. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions_out_of_max_allowed
The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS
was exceeded. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_indexscan_executors
The number of executions for index scan operators. num_killed_queries
The number of killed queries. num_opened_sessions
The number of sessions connected to the server. num_queries
The number of queries. num_query_errors_leader_changes
The number of the raft leader changes due to query errors. num_query_errors
The number of query errors. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. num_sentences
The number of statements received by the Graphd service. num_slow_queries
The number of slow queries. num_sort_executors
The number of executions for the Sort operator. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. slow_query_latency_us
The latency of slow queries. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. resp_part_completeness
The completeness of the partial success. You need to set accept_partial_success
to true
in the graph configuration first."},{"location":"nebula-dashboard/6.monitor-parameter/#meta","title":"Meta","text":"Parameter Description commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. heartbeat_latency_us
The latency of heartbeats. num_heartbeats
The number of heartbeats. num_raft_votes
The number of votes in Raft. transfer_leader_latency_us
The latency of transferring the raft leader. num_agent_heartbeats
The number of heartbeats for the AgentHBProcessor. agent_heartbeat_latency_us
The latency of the AgentHBProcessor. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_send_snapshot
The number of times that Raft sends snapshots to other nodes. append_log_latency_us
The latency of replicating the log record to a single node by Raft. append_wal_latency_us
The Raft write latency for a single WAL. num_grant_votes
The number of times that Raft votes for other nodes. num_start_elect
The number of times that Raft starts an election."},{"location":"nebula-dashboard/6.monitor-parameter/#storage","title":"Storage","text":"Parameter Description add_edges_latency_us
The latency of adding edges. add_vertices_latency_us
The latency of adding vertices. commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. delete_edges_latency_us
The latency of deleting edges. delete_vertices_latency_us
The latency of deleting vertices. get_neighbors_latency_us
The latency of querying neighbor vertices. get_dst_by_src_latency_us
The latency of querying the destination vertex by the source vertex. num_get_prop
The number of executions for the GetPropProcessor. num_get_neighbors_errors
The number of execution errors for the GetNeighborsProcessor. num_get_dst_by_src_errors
The number of execution errors for the GetDstBySrcProcessor. get_prop_latency_us
The latency of executions for the GetPropProcessor. num_edges_deleted
The number of deleted edges. num_edges_inserted
The number of inserted edges. num_raft_votes
The number of votes in Raft. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Storage service sent to the Meta service. num_rpc_sent_to_metad
The number of RPC requests that the Storaged service sent to the Metad service. num_tags_deleted
The number of deleted tags. num_vertices_deleted
The number of deleted vertices. num_vertices_inserted
The number of inserted vertices. transfer_leader_latency_us
The latency of transferring the raft leader. lookup_latency_us
The latency of executions for the LookupProcessor. num_lookup_errors
The number of execution errors for the LookupProcessor. num_scan_vertex
The number of executions for the ScanVertexProcessor. num_scan_vertex_errors
The number of execution errors for the ScanVertexProcessor. update_edge_latency_us
The latency of executions for the UpdateEdgeProcessor. num_update_vertex
The number of executions for the UpdateVertexProcessor. num_update_vertex_errors
The number of execution errors for the UpdateVertexProcessor. kv_get_latency_us
The latency of executions for the Getprocessor. kv_put_latency_us
The latency of executions for the PutProcessor. kv_remove_latency_us
The latency of executions for the RemoveProcessor. num_kv_get_errors
The number of execution errors for the GetProcessor. num_kv_get
The number of executions for the GetProcessor. num_kv_put_errors
The number of execution errors for the PutProcessor. num_kv_put
The number of executions for the PutProcessor. num_kv_remove_errors
The number of execution errors for the RemoveProcessor. num_kv_remove
The number of executions for the RemoveProcessor. forward_tranx_latency_us
The latency of transmission. scan_edge_latency_us
The latency of executions for the ScanEdgeProcessor. num_scan_edge_errors
The number of execution errors for the ScanEdgeProcessor. num_scan_edge
The number of executions for the ScanEdgeProcessor. scan_vertex_latency_us
The latency of executions for the ScanVertexProcessor. num_add_edges
The number of times that edges are added. num_add_edges_errors
The number of errors when adding edges. num_add_vertices
The number of times that vertices are added. num_start_elect
The number of times that Raft starts an election. num_add_vertices_errors
The number of errors when adding vertices. num_delete_vertices_errors
The number of errors when deleting vertices. append_log_latency_us
The latency of replicating the log record to a single node by Raft. num_grant_votes
The number of times that Raft votes for other nodes. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_delete_tags
The number of times that tags are deleted. num_delete_tags_errors
The number of errors when deleting tags. num_delete_edges
The number of edge deletions. num_delete_edges_errors
The number of errors when deleting edges num_send_snapshot
The number of times that snapshots are sent. update_vertex_latency_us
The latency of executions for the UpdateVertexProcessor. append_wal_latency_us
The Raft write latency for a single WAL. num_update_edge
The number of executions for the UpdateEdgeProcessor. delete_tags_latency_us
The latency of deleting tags. num_update_edge_errors
The number of execution errors for the UpdateEdgeProcessor. num_get_neighbors
The number of executions for the GetNeighborsProcessor. num_get_dst_by_src
The number of executions for the GetDstBySrcProcessor. num_get_prop_errors
The number of execution errors for the GetPropProcessor. num_delete_vertices
The number of times that vertices are deleted. num_lookup
The number of executions for the LookupProcessor. num_sync_data
The number of times the Storage service synchronizes data from the Drainer. num_sync_data_errors
The number of errors that occur when the Storage service synchronizes data from the Drainer. sync_data_latency_us
The latency of the Storage service synchronizing data from the Drainer."},{"location":"nebula-dashboard/6.monitor-parameter/#graph_space","title":"Graph space","text":"Note
Space-level metrics are created dynamically: a metric is created, and becomes available for query, only after the corresponding behavior has been triggered in the graph space.
Parameter Descriptionnum_active_queries
The number of queries currently being executed. num_queries
The number of queries. num_sentences
The number of statements received by the Graphd service. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. num_slow_queries
The number of slow queries. num_query_errors
The number of query errors. num_query_errors_leader_changes
The number of raft leader changes due to query errors. num_killed_queries
The number of killed queries. num_aggregate_executors
The number of executions for the Aggregation operator. num_sort_executors
The number of executions for the Sort operator. num_indexscan_executors
The number of executions for index scan operators. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_opened_sessions
The number of sessions connected to the server. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. slow_query_latency_us
The latency of slow queries."},{"location":"nebula-studio/system-settings/","title":"Global settings","text":"This topic introduces the global settings of NebulaGraph Studio, including language switching and beta functions.
Beta functions: Switch on/off beta features, which include view schema, text to query and AI import.
The text to query and AI import features require AI-related configurations, which are described below.
The text to query and AI import are artificial intelligence features developed based on the large language model (LLM) and require the following parameters to be configured.
Parameter Description API type The API type for AI. Valid values are OpenAI
and Aliyun
. URL The API URL. Fill in the correct URL format according to the corresponding API type. For example, https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/chat/completions?api-version={api-version}
. Key The key used to validate the API. The key is required when using an online large language model, and is optional depending on the actual settings when using an offline large language model. Model The version of the large language model. The model is required when using an online large language model, and is optional depending on the actual settings when using an offline large language model. Max text length The maximum length for receiving or generating a single piece of text. Unit: byte."},{"location":"nebula-studio/about-studio/st-ug-limitations/","title":"Limitations","text":"This topic introduces the limitations of Studio.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#architecture","title":"Architecture","text":"For now, Studio v3.x supports x86_64 architecture only.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#upload_data","title":"Upload data","text":"Only CSV files without headers can be uploaded, but no limitations are applied to the size and store period for a single file. The maximum data volume depends on the storage capacity of your machine.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#data_backup","title":"Data backup","text":"For now, only supports exporting query results in CSV format on Console, and other data backup methods are not supported.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#ngql_statements","title":"nGQL statements","text":"On the Console page of Docker-based and RPM-based Studio v3.x, all the nGQL syntaxes except these are supported:
USE <space_name>
: You cannot run such a statement on the Console page to choose a graph space. As an alternative, you can click a graph space name in the drop-down list of Current Graph Space. We recommend that you use the latest version of Chrome to get access to Studio. Otherwise, some features may not work properly.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/","title":"What is NebulaGraph Studio","text":"NebulaGraph Studio (Studio in short) is a browser-based visualization tool to manage NebulaGraph. It provides you with a graphical user interface to manipulate graph schemas, import data, and run nGQL statements to retrieve data. With Studio, you can quickly become a graph exploration expert from scratch. You can view the latest source code in the NebulaGraph GitHub repository, see nebula-studio for details.
Note
You can also try some functions online in Studio.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#deployment","title":"Deployment","text":"In addition to deploying Studio with RPM-based, DEB-based, or Tar-based packages, or with Docker, you can also deploy Studio with Helm in the Kubernetes cluster. For more information, see Deploy Studio.
All of the above four deployment methods provide the same functions, which may be subject to restrictions when using Studio. For more information, see Limitations.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#features","title":"Features","text":"Studio can easily manage NebulaGraph data, with the following functions:
You can use Studio in one of these scenarios:
Authentication is not enabled in NebulaGraph by default. Users can log into Studio with the root
account and any password.
When NebulaGraph enables authentication, users can only sign into Studio with the specified account. For more information, see Authentication.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#version_compatibility","title":"Version compatibility","text":"Note
The Studio version is released independently of the NebulaGraph core. The correspondence between the versions of Studio and the NebulaGraph core is shown in the table below.
NebulaGraph version Studio version 3.6.0 3.8.0, 3.7.0 3.5.0 3.7.0 3.4.0 ~ 3.4.1 3.7.0, 3.6.0, 3.5.1, 3.5.0 3.3.0 3.5.1, 3.5.0 3.0.0 ~ 3.2.0 3.4.1, 3.4.0 3.1.0 3.3.2 3.0.0 3.2.x 2.6.x 3.1.x 2.0 & 2.0.1 2.x 1.x 1.x"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#check_updates","title":"Check updates","text":"Studio is in development. Users can view the features of the latest releases through the Changelog.
To view the Changelog, on the upper-right corner of the page, click the version and then New version.
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/","title":"Connect to NebulaGraph","text":"After successfully launching Studio, you need to configure to connect to NebulaGraph. This topic describes how Studio connects to the NebulaGraph database.
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/#prerequisites","title":"Prerequisites","text":"Before connecting to the NebulaGraph database, you need to confirm the following information:
The Graph service of NebulaGraph is started, with the default port 9669
. To connect Studio to NebulaGraph, follow these steps:
Type http://<ip_address>:7001
in the address bar of your browser.
The following login page shows that Studio starts successfully.
On the Config Server page of Studio, configure these fields:
Graphd IP address: Enter the IP address of the Graph service of NebulaGraph. For example, 192.168.10.100
.
Note
If Studio and NebulaGraph are deployed on the same machine, enter the actual IP address of the machine rather than 127.0.0.1
or localhost
. Port: The port of the Graph service. The default port is 9669
. Username and Password: Fill in the login account according to the authentication settings of NebulaGraph.
If authentication is not enabled, you can use root
and any password as the username and its password.
If authentication is enabled, you can use root
and nebula
as the username and its password.
One session continues for up to 30 minutes. If you do not operate Studio within 30 minutes, the active session will time out and you must connect to NebulaGraph again.
A welcome page is displayed on the first login, showing the relevant functions according to the usage process, and the test datasets can be automatically downloaded and imported.
To visit the welcome page, click .
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/#next_to_do","title":"Next to do","text":"When Studio is successfully connected to NebulaGraph, you can do these operations:
Note
The permissions of an account determine the operations that can be performed. For details, see Roles and privileges.
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/#log_out","title":"Log out","text":"If you want to reconnect to NebulaGraph, you can log out and reconfigure the database.
Click the user profile picture in the upper right corner, and choose Log out.
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/","title":"Deploy Studio","text":"This topic describes how to deploy Studio locally by RPM, DEB, tar package and Docker.
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#rpm-based_studio","title":"RPM-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites","title":"Prerequisites","text":"Before you deploy RPM-based Studio, you must confirm that:
lsof
.Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by Studio.Select and download the RPM package according to your needs. It is recommended to select the latest version. Common links are as follows:
Installation package Checksum NebulaGraph version nebula-graph-studio-3.9.0.x86_64.rpm nebula-graph-studio-3.9.0.x86_64.rpm.sha256 masterUse sudo rpm -i <rpm_name>
to install RPM package.
For example, install Studio 3.9.0, use the following command. The default installation path is /usr/local/nebula-graph-studio
.
$ sudo rpm -i nebula-graph-studio-3.9.0.x86_64.rpm\n
You can also install it to the specified path using the following command:
$ sudo rpm -i nebula-graph-studio-3.9.0.x86_64.rpm --prefix=<path> \n
When the screen returns the following message, it means that the PRM-based Studio has been successfully started.
Start installing NebulaGraph Studio now...\nNebulaGraph Studio has been installed.\nNebulaGraph Studio started automatically.\n
When Studio is started, use http://<ip address>:7001
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
You can uninstall Studio using the following command:
$ sudo rpm -e nebula-graph-studio-3.9.0.x86_64\n
If these lines are returned, PRM-based Studio has been uninstalled.
NebulaGraph Studio removed, bye~\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#exception_handling","title":"Exception handling","text":"If the automatic start fails during the installation process or you want to manually start or stop the service, use the following command:
$ bash /usr/local/nebula-graph-studio/scripts/rpm/start.sh\n
$ bash /usr/local/nebula-graph-studio/scripts/rpm/stop.sh\n
If you encounter an error bind EADDRINUSE 0.0.0.0:7001
when starting the service, you can use the following command to check port 7001 usage.
$ lsof -i:7001\n
If the port is occupied and the process on that port cannot be terminated, you can modify the startup port within the studio configuration and restart the service.
//Modify the studio service configuration. The default path to the configuration file is `/usr/local/nebula-graph-studio`.\n$ vi etc/studio-api.yam\n\n//Modify this port number and change it to any \nPort: 7001\n\n//Restart service\n$ systemctl restart nebula-graph-studio.service\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#deb-based_studio","title":"DEB-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_1","title":"Prerequisites","text":"Before you deploy DEB-based Studio, you must do a check of these:
Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by Studio/usr/lib/systemd/system
exists in the system. If not, create it manually.Select and download the DEB package according to your needs. It is recommended to select the latest version. Common links are as follows:
Installation package Checksum NebulaGraph version nebula-graph-studio-3.9.0.x86_64.deb nebula-graph-studio-3.9.0.x86_64.deb.sha256 masterUse sudo dpkg -i <deb_name>
to install DEB package.
For example, install Studio 3.9.0, use the following command:
$ sudo dpkg -i nebula-graph-studio-3.9.0.x86_64.deb\n
When Studio is started, use http://<ip address>:7001
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
You can uninstall Studio using the following command:
$ sudo dpkg -r nebula-graph-studio\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#tar-based_studio","title":"tar-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_2","title":"Prerequisites","text":"Before you deploy tar-based Studio, you must do a check of these:
Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by StudioSelect and download the tar package according to your needs. It is recommended to select the latest version. Common links are as follows:
Installation package Studio version nebula-graph-studio-3.9.0.x86_64.tar.gz 3.9.0Use tar -xvf
to decompress the tar package.
$ tar -xvf nebula-graph-studio-3.9.0.x86_64.tar.gz\n
Deploy and start nebula-graph-studio.
$ cd nebula-graph-studio\n$ ./server\n
When Studio is started, use http://<ip address>:7001
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
You can use kill pid
to stop the service:
$ kill $(lsof -t -i :7001) #stop nebula-graph-studio\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#docker-based_studio","title":"Docker-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_3","title":"Prerequisites","text":"Before you deploy Docker-based Studio, you must do a check of these:
Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by StudioTo deploy and start Docker-based Studio, run the following commands. Here we use NebulaGraph vmaster for demonstration:
Download the configuration files for the deployment.
Installation package NebulaGraph version nebula-graph-studio-3.9.0.tar.gz masterCreate the nebula-graph-studio-3.9.0
directory and decompress the installation package to the directory.
$ mkdir nebula-graph-studio-3.9.0 -zxvf nebula-graph-studio-3.9.0.gz -C nebula-graph-studio-3.9.0\n
Change to the nebula-graph-studio-3.9.0
directory.
$ cd nebula-graph-studio-3.9.0\n
Pull the Docker image of Studio.
$ docker-compose pull\n
Build and start Docker-based Studio. In this command, -d
is to run the containers in the background.
$ docker-compose up -d\n
If these lines are returned, Docker-based Studio v3.x is deployed and started.
Creating docker_web_1 ... done\n
When Docker-based Studio is started, use http://<ip address>:7001
to get access to Studio.
Note
Run ifconfig
or ipconfig
to get the IP address of the machine where Docker-based Studio is running. On the machine running Docker-based Studio, you can use http://localhost:7001
to get access to Studio.
If you can see the Config Server page on the browser, Docker-based Studio is started successfully.
This section describes how to deploy Studio with Helm.
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_4","title":"Prerequisites","text":"Before installing Studio, you need to install the following software and ensure the correct version of the software:
Software Requirement Kubernetes >= 1.14 Helm >= 3.2.0"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#install_2","title":"Install","text":"Use Git to clone the source code of Studio to the host.
$ git clone https://github.com/vesoft-inc/nebula-studio.git\n
Make the nebula-studio
directory the current working directory.
bash $ cd nebula-studio
Assume using release name:my-studio
, installed Studio in Helm Chart.
$ helm upgrade --install my-studio --set service.type=NodePort --set service.port=30070 deployment/helm\n
The configuration parameters of the Helm Chart are described below.
Parameter Default value Description replicaCount 0 The number of replicas for Deployment. image.nebulaStudio.name vesoft/nebula-graph-studio The image name of nebula-graph-studio. image.nebulaStudio.version v3.9.0 The image version of nebula-graph-studio. service.type ClusterIP The service type, which should be one ofNodePort
, ClusterIP
, and LoadBalancer
. service.port 7001 The expose port for nebula-graph-studio's web. service.nodePort 32701 The proxy port for accessing nebula-studio outside kubernetes cluster. resources.nebulaStudio {} The resource limits/requests for nebula-studio. persistent.storageClassName \"\" The name of storageClass. The default value will be used if not specified. persistent.size 5Gi The persistent volume size. When Studio is started, use http://<node_address>:30070/
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
$ helm uninstall my-studio\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#next_to_do","title":"Next to do","text":"On the Config Server page, connect Docker-based Studio to NebulaGraph. For more information, see Connect to NebulaGraph.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-edge-type/","title":"Manage edge types","text":"After a graph space is created in NebulaGraph, you can create edge types. With Studio, you can choose to use the Console page or the Schema page to create, retrieve, update, or delete edge types. This topic introduces how to use the Schema page to operate edge types in a graph space only.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-edge-type/#prerequisites","title":"Prerequisites","text":"To operate an edge type on the Schema page of Studio, you must do a check of these:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Edge Type tab and click the + Create button.
On the Create Edge Type page, do these settings:
serve
is used.Define Properties (Optional): If necessary, click + Add Property to do these settings:
TTL_COL
and TTL_ DURATION
(in seconds). For more information about both parameters, see TTL configuration.When the preceding settings are completed, in the Equivalent to the following nGQL statement panel, you can see the nGQL statement equivalent to these settings.
Confirm the settings and then click the + Create button.
When the edge type is created successfully, the Define Properties panel shows all its properties on the list.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-edge-type/#edit_an_edge_type","title":"Edit an edge type","text":"In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Edge Type tab, find an edge type and then click the button in the Operations column.
On the Edit page, do these operations:
Comment
.To edit the TTL configuration: On the Set TTL panel, click Edit and then change the configuration of TTL_COL
and TTL_DURATION
(in seconds).
Note
For information about the coexistence problem of TTL and index, see TTL.
Danger
Confirm the impact before deleting the Edge type. The deleted data cannot be restored if it is not backup.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Edge Type tab, find an edge type and then click the button in the Operations column.
Click OK to confirm in the pop-up dialog box.
After the edge type is created, you can use the Console page to insert edge data one by one manually or use the Import page to bulk import edge data.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-index/","title":"Manage indexes","text":"You can create an index for a Tag and/or an Edge type. An index lets traversal start from vertices or edges with the same property and it can make a query more efficient. With Studio, you can use the Console page or the Schema page to create, retrieve, and delete indexes. This topic introduces how to use the Schema page to operate an index only.
Note
You can create an index when a Tag or an Edge Type is created. But an index can decrease the write speed during data import. We recommend that you import data firstly and then create and rebuild an index. For more information, see Index overview.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-index/#prerequisites","title":"Prerequisites","text":"To operate an index on the Schema page of Studio, you must do a check of these:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab and then click the + Create button.
On the Create page, do these settings:
Indexed Properties (Optional): Click Add property, and then, in the dialog box, choose a property. If necessary, repeat this step to choose more properties. You can drag the properties to sort them. In this example, degree
is chosen.
Note
The order of the indexed properties has an effect on the result of the LOOKUP
statement. For more information, see nGQL Manual.
When the settings are done, the Equivalent to the following nGQL statement panel shows the statement equivalent to the settings.
Confirm the settings and then click the + Create button. When an index is created, the index list shows the new index.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab, in the upper left corner, choose an index type, Tag or Edge Type.
In the list, find an index and click its row. All its details are shown in the expanded row.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab, in the upper left corner, choose an index type, Tag or Edge Type.
Click the Index tab, find an index and then click the button Rebuild in the Operations column.
Note
For more Information, see REBUILD INDEX.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-index/#delete_an_index","title":"Delete an index","text":"To delete an index on Schema, follow these steps:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab, find an index and then click the button in the Operations column.
Click OK to confirm in the pop-up dialog box.
When Studio is connected to NebulaGraph, you can create or delete a graph space. You can use the Console page or the Schema page to do these operations. This article only introduces how to use the Schema page to operate graph spaces in NebulaGraph.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-space/#prerequisites","title":"Prerequisites","text":"To operate a graph space on the Schema page of Studio, you must do a check of these:
root
and any password to sign in to Studio.root
and its password to sign in to Studio.In the toolbar, click the Schema tab.
In the Graph Space List page, click Create Space, do these settings:
Name: the name of the graph space. In this example, basketballplayer
is used. The name must be unique in the database. Vid Type: the data type of VIDs in the graph space, which can be FIXED_STRING(<N>)
or INT64
. A graph space can only select one VID type. In this example, FIXED_STRING(32)
is used. For more information, see VID. Comment: the description of the graph space. In this example, Statistics of basketball players
is used. Optional Parameters: set the values of partition_num
and replica_factor
respectively. In this example, these parameters are set to 100
and 1
respectively. For more information, see CREATE SPACE
syntax.In the Equivalent to the following nGQL statement panel, you can see the statement equivalent to the preceding settings.
CREATE SPACE basketballplayer (partition_num = 100, replica_factor = 1, vid_type = FIXED_STRING(32)) COMMENT = \"Statistics of basketball players\"\n
Confirm the settings and then click the + Create button. If the graph space is created successfully, you can see it on the graph space list.
Danger
Deleting the space will delete all the data in it, and the deleted data cannot be restored if it is not backed up.
In the toolbar, click the Schema tab.
In the Graph Space List, find the space you want to delete, and click Delete Graph Space in the Operation column.
On the dialog box, confirm the information and then click OK.
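On the Console page, the same operation corresponds to the DROP SPACE statement. For example, with this topic's example space name:
nebula> DROP SPACE basketballplayer;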
After a graph space is created, you can create or edit a schema, including:
After a graph space is created in NebulaGraph, you can create tags. With Studio, you can use the Console page or the Schema page to create, retrieve, update, or delete tags. This topic introduces how to use the Schema page to operate tags in a graph space only.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-tag/#prerequisites","title":"Prerequisites","text":"To operate a tag on the Schema page of Studio, you must do a check of these:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, choose another name to switch to a different graph space.
Click the Tag tab and click the + Create button.
On the Create page, do these settings:
course
is specified.Define Properties (Optional): If necessary, click + Add Property to do these settings:
TTL_COL
and TTL_DURATION
(in seconds). For more information about both parameters, see TTL configuration.When the preceding settings are completed, in the Equivalent to the following nGQL statement panel, you can see the nGQL statement equivalent to these settings.
Confirm the settings and then click the + Create button.
When the tag is created successfully, the Define Properties panel shows all its properties on the list.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-tag/#edit_a_tag","title":"Edit a tag","text":"In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Tag tab, find a tag and then click the button in the Operations column.
On the Edit page, do these operations:
Comment
.To edit the TTL configuration: On the Set TTL panel, click Edit and then change the configuration of TTL_COL
and TTL_DURATION
(in seconds).
Note
For the problem of the coexistence of TTL and index, see TTL.
Danger
Confirm the impact before deleting the tag. The deleted data cannot be restored if it is not backup.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Tag tab, find an tag and then click the button in the Operations column.
Click OK to confirm delete a tag in the pop-up dialog box.
After the tag is created, you can use the Console page to insert vertex data one by one manually or use the Import page to bulk import vertex data.
"},{"location":"nebula-studio/manage-schema/st-ug-view-schema/","title":"View Schema","text":"Users can visually view schemas in NebulaGraph Studio.
"},{"location":"nebula-studio/manage-schema/st-ug-view-schema/#steps","title":"Steps","text":"In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
Click View Schema tab and click the Get Schema button.
In the Graph Space List page, find a graph space and then perform the following operations in the Operations column:
Studio supports the schema drafting function. Users can design their schemas on the canvas to visually display the relationships between vertices and edges, and apply the schema to a specified graph space after the design is completed.
"},{"location":"nebula-studio/quick-start/draft/#features","title":"Features","text":"At the top navigation bar, click .
"},{"location":"nebula-studio/quick-start/draft/#design_schema","title":"Design schema","text":"The following steps take designing the schema of the basketballplayer
dataset as an example to demonstrate how to use the schema drafting function.
player
, and add two properties name
and age
.team
, and the property is name
.player
to the anchor point of the tag team
. Click the generated edge, fill in the name of the edge type as serve
, and add two properties start_year
and end_year
.player
to another one of its own. Click the generated edge, fill in the name of the edge type as follow
, and add a property degree
.Import the schema to a new or existing space, and click Confirm.
Note
Select the schema draft that you want to modify from the Draft list on the left side of the page. Click at the upper right corner after the modification.
Note
The graph space to which the schema has been applied will not be modified synchronously.
"},{"location":"nebula-studio/quick-start/draft/#delete_schema","title":"Delete schema","text":"Select the schema draft that you want to delete from the Draft list on the left side of the page, click X at the upper right corner of the thumbnail, and confirm to delete it.
"},{"location":"nebula-studio/quick-start/draft/#export_schema","title":"Export Schema","text":"Click at the upper right corner to export the schema as a PNG image.
"},{"location":"nebula-studio/quick-start/st-ug-console/","title":"Console","text":"Studio console interface is shown as follows.
"},{"location":"nebula-studio/quick-start/st-ug-console/#entry","title":"Entry","text":"In the top navigation bar, click Console.
"},{"location":"nebula-studio/quick-start/st-ug-console/#overview","title":"Overview","text":"The following table lists the functions on the console page.
number function descriptions 1 View the schema Display the schemas of the graph spaces. 2 Select a space Select a space in the graph space drop down list. The console does not support using theUSE <space_name>
statement to switch graph spaces. 3 Favorites Click the button to expand the favorites. Select a statement, and it automatically populates the input box. 4 History list Click the button to view the execution history. In the execution history list, click one of the statements, and the statement is automatically populates the input box. The list provides the record of the last 15 statements.Type /
in the input box to quickly select a historical query statement. 5 Clean input box Click the button to clear the content populated in the input box. 6 Run After entering the nGQL statement in the input box, click the button to start running the statement. 7 Input box The area where the nGQL statement is entered. The statement displays different colors depending on the schemas or character strings. Code auto-completion is supported. You can quickly enter a tag or edge type based on the schema.You can input multiple statements and run them at the same time by using the separator ;
. Use the symbol //
to add comments.Support right-clicking on a selected statement and then performing operations such as cut, copy, or run. 8 Custom parameters display Click the button to expand the custom parameters for the parameterized query. For details, see Manage parameters. 9 Statement running status After running the nGQL statement, the statement running status is displayed. If the statement runs successfully, the statement is displayed in green. If the statement fails, the statement is displayed in red. 10 Add to favorites Click the button to save the statement as a favorite. The button for the favorite statement is colored in yellow. 11 Export CSV file or PNG file After running the nGQL statement to return the result, when the result is in the Table window, click the button to export as a CSV file. Switch to the Graph window and click the button to export the results as a CSV file or a PNG image. 12 Expand/hide execution results Click the button to hide the result or click to expand the result. 13 Close execution results Click the button to close the result returned by this nGQL statement. 14 Table window Display the results returned by the nGQL statement in a table. 15 Plan window Display the execution plan. If an EXPLAIN
or PROFILE
statement is executed, the window presents the execution plan in visual form. See the description of the execution plan below. 16 Graph window Display the results returned by the nGQL statement in a graph if the results contain complete vertex and edge information. Click the button on the right to view the overview panel. 17 AI Assistant You can chat with an AI assistant to convert natural language instructions into nGQL query statements and then copy the nGQL statements into the input box with one click. This feature needs to be set up and enabled in the system settings before use.Note: The schema information of the current graph space is sent to the large language model when you chat with the assistant. Please pay attention to information security.You can click the text2match toggle to switch between general Q&A and query Q&A. The query Q&A can convert the natural language instructions to nGQL query statements."},{"location":"nebula-studio/quick-start/st-ug-console/#execution_plan_descriptions","title":"Execution plan descriptions","text":"The Studio can display the execution plan of the statement. The execution plan descriptions are as follows.
No. Description 1 An EXPLAIN
or PROFILE
statement. 2 The operators used by the execution plan, which are sorted according to the execution duration. The top three operators are labeled as red, orange, and yellow, respectively. Clicking on an operator directly selects the corresponding operator in the operator execution flow and displays the operator information.Note: The PROFILE
statement actually executes the statement, and the actual execution durations can be obtained and sorted. The EXPLAIN
statement does not execute the statement, and all operators are considered to have the same execution duration and are all labeled as red. 3 The operator execution flow. For each operator, the following information is displayed: in-parameters, out-parameters, and total execution duration.The Select
, Loop
, PassThrough
, and Start
operators have independent color schemes.The arrows show the direction of data flow and the number of rows. The thicker the arrows, the more rows of data. You can click on the operator to check the details of the operator on the right side. 4 The details of the operator, divided into Profiling data
and Operator info
.Profiling data
shows the performance data of the operator, including the rows of data received, the execution time, the total time, etc.Operator info
shows the detailed operation information of the operator. 5 Zoom out, zoom in, or reverse the execution flow. 6 The duration of the statement. 7 Full screen or cancel full screen."},{"location":"nebula-studio/quick-start/st-ug-create-schema/","title":"Create a schema","text":"To batch import data into NebulaGraph, you must have a graph schema. You can create a schema on the Console page or on the Schema page of Studio.
Note
To create a graph schema on Studio, you must check the following:
Note
If no graph space exists and your account has the GOD privilege, you can create a graph space on the Console page. For more information, see CREATE SPACE.
"},{"location":"nebula-studio/quick-start/st-ug-create-schema/#create_a_schema_with_schema","title":"Create a schema with Schema","text":"Create tags. For more information, see Operate tags.
Create edge types. For more information, see Operate edge types.
In the toolbar, click the Console tab.
In the Current Graph Space field, choose a graph space name. In this example, basketballplayer is used.
In the input box, enter these statements one by one and click the button Run.
// To create a tag named \"player\", with two property\nnebula> CREATE TAG player(name string, age int);\n\n// To create a tag named \"team\", with one property\nnebula> CREATE TAG team(name string);\n\n// To create an edge type named \"follow\", with one properties\nnebula> CREATE EDGE follow(degree int);\n\n// To create an edge type named \"serve\", with two properties\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
If the preceding statements are executed successfully, the schema is created. You can run the statements as follows to view the schema.
// To list all the tags in the current graph space\nnebula> SHOW TAGS;\n\n// To list all the edge types in the current graph space\nnebula> SHOW EDGES;\n\n// To view the definition of the tags and edge types\nDESCRIBE TAG player;\nDESCRIBE TAG team;\nDESCRIBE EDGE follow;\nDESCRIBE EDGE serve;\n
If the schema is created successfully, in the result window, you can see the definition of the tags and edge types.
"},{"location":"nebula-studio/quick-start/st-ug-create-schema/#next_to_do","title":"Next to do","text":"When a schema is created, you can import data.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/","title":"Import data","text":"Studio supports importing data in CSV format into NebulaGraph through an interface.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#prerequisites","title":"Prerequisites","text":"To batch import data, do a check of these:
In the top navigation bar, click Import.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#steps","title":"Steps","text":"Importing data is divided into 2 parts, creating a new data source and creating an import task, which will be described in detail next.
Note
You can also import tasks via the AI Import feature, which is a beta feature that needs to be enabled and configured in the system settings before use.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#create_a_new_data_source","title":"Create a new data source","text":"Click New Data Source in the upper right corner of the page to set the data source and its related settings. Currently, 3 types of data sources are supported.
Type of data source Description Cloud storage Add cloud storage as the CSV file source, which only supports cloud services compatible with the Amazon S3 interface. SFTP Add SFTP as the CSV file source. Local file Upload a local CSV file. The file size can not exceed 200 MB, please put the files exceeding the limit into other types of data sources.Note
Click New Import at the top left corner of the page to complete the following settings:
Caution
Users can also click Import Template to download the sample configuration file example.yaml
, configure it and then upload the configuration file. Configure in the same way as NebulaGraph Importer.
Map Tags:
NULL
or have DEFAULT
set, you can leave the corresponding column unspecified.After completing the settings, click Import, enter the password for the NebulaGraph account, and confirm.
After the import task is created, you can view the progress of the import task in the Import Data tab, which supports operations such as filtering tasks based on graph space, editing the task, viewing logs, downloading logs, reimporting, downloading configuration files, and deleting tasks.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#import_data_using_ai_import","title":"Import data using AI Import","text":"Note
After the import task is completed, check whether the data is imported successfully. If not, it is recommended that you check the task logs on the import page to see whether issues such as timeouts, privacy policy violations, service interruption, or encoding errors occurred.
Click AI Import in the upper left corner of the page to complete the following settings:
You can view the LLM
parameters related to AI import in the configuration file.
After completing the settings, click Next to confirm the file for import and the AI URL to be used, and then click Start.
After the import task is created, you can view the progress of the import task on the Import Data tab, which supports operations such as viewing logs, downloading logs, reimporting, and deleting tasks.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#next","title":"Next","text":"After completing the data import, users can access the Console page.
"},{"location":"nebula-studio/quick-start/st-ug-plan-schema/","title":"Design a schema","text":"To manipulate graph data in NebulaGraph with Studio, you must have a graph schema. This article introduces how to design a graph schema for NebulaGraph.
A graph schema for NebulaGraph must have these essential elements:
In this article, you can install the sample data set basketballplayer and use it to explore a pre-designed schema.
This table gives all the essential elements of the schema.
Element Name Property name (Data type) Description Tag player -name
(string
) - age
(int
) Represents the player. Tag team - name
(string
) Represents the team. Edge type serve - start_year
(int
) - end_year
(int
) Represent the players behavior.This behavior connects the player to the team, and the direction is from player to team. Edge type follow - degree
(int
) Represent the players behavior.This behavior connects the player to the player, and the direction is from a player to a player. This figure shows the relationship (serve/follow) between a player and a team.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/","title":"Connecting to the database error","text":""},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#problem_description","title":"Problem description","text":"According to the connect Studio operation, it prompts failed.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#possible_causes_and_solutions","title":"Possible causes and solutions","text":"You can troubleshoot the problem by following the steps below.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#step1_confirm_that_the_format_of_the_host_field_is_correct","title":"Step1: Confirm that the format of the Host field is correct","text":"You must fill in the IP address (graph_server_ip
) and port of the NebulaGraph database Graph service. If no changes are made, the port defaults to 9669
. Even if NebulaGraph and Studio are deployed on the current machine, you must use the local IP address instead of 127.0.0.1
, localhost
or 0.0.0.0
.
If authentication is not enabled, you can use root and any password as the username and its password.
If authentication is enabled and different users are created and assigned roles, users in different roles log in with their accounts and passwords.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#step3_confirm_that_nebulagraph_service_is_normal","title":"Step3: Confirm that NebulaGraph service is normal","text":"Check NebulaGraph service status. Regarding the operation of viewing services:
If the NebulaGraph service is normal, proceed to Step 4 to continue troubleshooting. Otherwise, please restart NebulaGraph service.
Note
If you used docker-compose up -d
to satrt NebulaGraph before, you must run the docker-compose down
to stop NebulaGraph.
Run a command (for example, telnet 9669) on the Studio machine to confirm whether NebulaGraph's Graph service network connection is normal.
If the connection fails, check according to the following steps:
If you cannot connect to the NebulaGraph service after troubleshooting with the above steps, please go to the NebulaGraph forum for consultation.
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/","title":"Cannot access to Studio","text":""},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#problem_description","title":"Problem description","text":"I follow the document description and visit 127.0.0.1:7001
or 0.0.0.0:7001
after starting Studio, why can\u2019t I open the page?
You can troubleshoot the problem by following the steps below.
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#step1_confirm_system_architecture","title":"Step1: Confirm system architecture","text":"It is necessary to confirm whether the machine where the Studio service is deployed is of x86_64 architecture. Currently, Studio only supports x86_64 architecture.
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#step2_check_if_the_studio_service_starts_normally","title":"Step2: Check if the Studio service starts normally","text":"systemctl status nebula-graph-studio
to see the running status.sudo lsof -i:7001
to check port status.For Studio deployed with docker, use docker-compose ps
to see the running status. Run docker-compose ps
to check if the service has started normally.
If the service is normal, the return result is as follows. Among them, the State
column should all be displayed as Up
.
Name Command State Ports\n------------------------------------------------------------------------------------------------------\nnebula-web-docker_client_1 ./nebula-go-api Up 0.0.0.0:32782->8080/tcp\nnebula-web-docker_importer_1 nebula-importer --port=569 ... Up 0.0.0.0:32783->5699/tcp\nnebula-web-docker_nginx_1 /docker-entrypoint.sh ngin ... Up 0.0.0.0:7001->7001/tcp, 80/tcp\nnebula-web-docker_web_1 docker-entrypoint.sh npm r ... Up 0.0.0.0:32784->7001/tcp\n
If the above result is not returned, stop Studio and restart it first. For details, refer to Deploy Studio.
!!! note
If you used `docker-compose up -d` to satrt NebulaGraph before, you must run the `docker-compose down` to stop NebulaGraph.\n
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#step3_confirm_address","title":"Step3: Confirm address","text":"If Studio and the browser are on the same machine, users can use localhost:7001
, 127.0.0.1:7001
or 0.0.0.0:7001
in the browser to access Studio.
If Studio and the browser are not on the same machine, you must enter <studio_server_ip>:7001
in the browser. Among them, studio_server_ip
refers to the IP address of the machine where the Studio service is deployed.
Run curl <studio_server_ip>:7001
-I to confirm if it is normal. If it returns HTTP/1.1 200 OK
, it means that the network is connected normally.
If the connection is refused, check according to the following steps:
If the connection fails, check according to the following steps:
If you cannot connect to the NebulaGraph service after troubleshooting with the above steps, please go to the NebulaGraph forum for consultation.
"},{"location":"nebula-studio/troubleshooting/st-ug-faq/","title":"FAQ","text":"Why can't I use a function?
If you find that a function cannot be used, it is recommended to troubleshoot the problem according to the following steps:
Confirm that NebulaGraph is the latest version. If you use Docker Compose to deploy the NebulaGraph database, it is recommended to run docker-compose pull && docker-compose up -d
to pull the latest Docker image and start the container.
Confirm that Studio is the latest version. For more information, refer to check updates.
Search the nebula forum, nebula and nebula-studio projects on the GitHub to confirm if there are already similar problems.
If none of the above steps solve the problem, you can submit a problem on the forum.
num_active_queries
The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. num_active_sessions
The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time.For example, when querying num_active_sessions.sum.5
, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is -20
(10-30). num_aggregate_executors
The number of executions for the Aggregation operator. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions_out_of_max_allowed
The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS
was exceeded. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_indexscan_executors
The number of executions for index scan operators. num_killed_queries
The number of killed queries. num_opened_sessions
The number of sessions connected to the server. num_queries
The number of queries. num_query_errors_leader_changes
The number of the raft leader changes due to query errors. num_query_errors
The number of query errors. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. num_sentences
The number of statements received by the Graphd service. num_slow_queries
The number of slow queries. num_sort_executors
The number of executions for the Sort operator. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. slow_query_latency_us
The latency of slow queries. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. resp_part_completeness
The completeness of the partial success. You need to set accept_partial_success
to true
in the graph configuration first."},{"location":"reuse/source-monitoring-metrics/#meta","title":"Meta","text":"Parameter Description commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. heartbeat_latency_us
The latency of heartbeats. num_heartbeats
The number of heartbeats. num_raft_votes
The number of votes in Raft. transfer_leader_latency_us
The latency of transferring the raft leader. num_agent_heartbeats
The number of heartbeats for the AgentHBProcessor. agent_heartbeat_latency_us
The latency of the AgentHBProcessor. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_send_snapshot
The number of times that Raft sends snapshots to other nodes. append_log_latency_us
The latency of replicating the log record to a single node by Raft. append_wal_latency_us
The Raft write latency for a single WAL. num_grant_votes
The number of times that Raft votes for other nodes. num_start_elect
The number of times that Raft starts an election."},{"location":"reuse/source-monitoring-metrics/#storage","title":"Storage","text":"Parameter Description add_edges_latency_us
The latency of adding edges. add_vertices_latency_us
The latency of adding vertices. commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. delete_edges_latency_us
The latency of deleting edges. delete_vertices_latency_us
The latency of deleting vertices. get_neighbors_latency_us
The latency of querying neighbor vertices. get_dst_by_src_latency_us
The latency of querying the destination vertex by the source vertex. num_get_prop
The number of executions for the GetPropProcessor. num_get_neighbors_errors
The number of execution errors for the GetNeighborsProcessor. num_get_dst_by_src_errors
The number of execution errors for the GetDstBySrcProcessor. get_prop_latency_us
The latency of executions for the GetPropProcessor. num_edges_deleted
The number of deleted edges. num_edges_inserted
The number of inserted edges. num_raft_votes
The number of votes in Raft. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Storage service sent to the Meta service. num_rpc_sent_to_metad
The number of RPC requests that the Storaged service sent to the Metad service. num_tags_deleted
The number of deleted tags. num_vertices_deleted
The number of deleted vertices. num_vertices_inserted
The number of inserted vertices. transfer_leader_latency_us
The latency of transferring the raft leader. lookup_latency_us
The latency of executions for the LookupProcessor. num_lookup_errors
The number of execution errors for the LookupProcessor. num_scan_vertex
The number of executions for the ScanVertexProcessor. num_scan_vertex_errors
The number of execution errors for the ScanVertexProcessor. update_edge_latency_us
The latency of executions for the UpdateEdgeProcessor. num_update_vertex
The number of executions for the UpdateVertexProcessor. num_update_vertex_errors
The number of execution errors for the UpdateVertexProcessor. kv_get_latency_us
The latency of executions for the Getprocessor. kv_put_latency_us
The latency of executions for the PutProcessor. kv_remove_latency_us
The latency of executions for the RemoveProcessor. num_kv_get_errors
The number of execution errors for the GetProcessor. num_kv_get
The number of executions for the GetProcessor. num_kv_put_errors
The number of execution errors for the PutProcessor. num_kv_put
The number of executions for the PutProcessor. num_kv_remove_errors
The number of execution errors for the RemoveProcessor. num_kv_remove
The number of executions for the RemoveProcessor. forward_tranx_latency_us
The latency of transmission. scan_edge_latency_us
The latency of executions for the ScanEdgeProcessor. num_scan_edge_errors
The number of execution errors for the ScanEdgeProcessor. num_scan_edge
The number of executions for the ScanEdgeProcessor. scan_vertex_latency_us
The latency of executions for the ScanVertexProcessor. num_add_edges
The number of times that edges are added. num_add_edges_errors
The number of errors when adding edges. num_add_vertices
The number of times that vertices are added. num_start_elect
The number of times that Raft starts an election. num_add_vertices_errors
The number of errors when adding vertices. num_delete_vertices_errors
The number of errors when deleting vertices. append_log_latency_us
The latency of replicating the log record to a single node by Raft. num_grant_votes
The number of times that Raft votes for other nodes. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_delete_tags
The number of times that tags are deleted. num_delete_tags_errors
The number of errors when deleting tags. num_delete_edges
The number of edge deletions. num_delete_edges_errors
The number of errors when deleting edges num_send_snapshot
The number of times that snapshots are sent. update_vertex_latency_us
The latency of executions for the UpdateVertexProcessor. append_wal_latency_us
The Raft write latency for a single WAL. num_update_edge
The number of executions for the UpdateEdgeProcessor. delete_tags_latency_us
The latency of deleting tags. num_update_edge_errors
The number of execution errors for the UpdateEdgeProcessor. num_get_neighbors
The number of executions for the GetNeighborsProcessor. num_get_dst_by_src
The number of executions for the GetDstBySrcProcessor. num_get_prop_errors
The number of execution errors for the GetPropProcessor. num_delete_vertices
The number of times that vertices are deleted. num_lookup
The number of executions for the LookupProcessor. num_sync_data
The number of times the Storage service synchronizes data from the Drainer. num_sync_data_errors
The number of errors that occur when the Storage service synchronizes data from the Drainer. sync_data_latency_us
The latency of the Storage service synchronizing data from the Drainer."},{"location":"reuse/source-monitoring-metrics/#graph_space","title":"Graph space","text":"Note
Space-level metrics are created dynamically, so that only when the behavior is triggered in the graph space, the corresponding metric is created and can be queried by the user.
Parameter Descriptionnum_active_queries
The number of queries currently being executed. num_queries
The number of queries. num_sentences
The number of statements received by the Graphd service. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. num_slow_queries
The number of slow queries. num_query_errors
The number of query errors. num_query_errors_leader_changes
The number of raft leader changes due to query errors. num_killed_queries
The number of killed queries. num_aggregate_executors
The number of executions for the Aggregation operator. num_sort_executors
The number of executions for the Sort operator. num_indexscan_executors
The number of executions for index scan operators. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_opened_sessions
The number of sessions connected to the server. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. slow_query_latency_us
The latency of slow queries."},{"location":"reuse/source_connect-to-nebula-graph/","title":"Source connect to nebula graph","text":"This topic provides basic instruction on how to use the native CLI client NebulaGraph Console to connect to NebulaGraph.
Caution
When connecting to NebulaGraph for the first time, you must register the Storage Service before querying data.
NebulaGraph supports multiple types of clients, including a CLI client, a GUI client, and clients developed in popular programming languages. For more information, see the client list.
"},{"location":"reuse/source_connect-to-nebula-graph/#prerequisites","title":"Prerequisites","text":"The NebulaGraph Console version is compatible with the NebulaGraph version.
Note
NebulaGraph Console and NebulaGraph of the same version number are the most compatible. There may be compatibility issues when connecting to NebulaGraph with a different version of NebulaGraph Console. The error message incompatible version between client and server
is displayed when there is such an issue.
On the NebulaGraph Console releases page, select a NebulaGraph Console version and click Assets.
Note
It is recommended to select the latest version.
In the Assets area, find the correct binary file for the machine where you want to run NebulaGraph Console and download the file to the machine.
(Optional) Rename the binary file to nebula-console
for convenience.
Note
For Windows, rename the file to nebula-console.exe
.
On the machine to run NebulaGraph Console, grant the execute permission of the nebula-console binary file to the user.
Note
For Windows, skip this step.
$ chmod 111 nebula-console\n
In the command line interface, change the working directory to the one where the nebula-console binary file is stored.
Run the following command to connect to NebulaGraph.
$ ./nebula-console -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
> nebula-console.exe -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP (or hostname) of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is second. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. For information on more parameters, see the project repository.
RPM and DEB are common package formats on Linux systems. This topic shows how to quickly install NebulaGraph with the RPM or DEB package.
Note
The console is not complied or packaged with NebulaGraph server binaries. You can install nebula-console by yourself.
"},{"location":"reuse/source_install-nebula-graph-by-rpm-or-deb/#prerequisites","title":"Prerequisites","text":"wget
is installed.Note
NebulaGraph is currently only supported for installation on Linux systems, and only CentOS 7.x, CentOS 8.x, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 operating systems are supported.
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.deb\n
For example, download the release package master
for Centos 7.5
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm.sha256sum.txt\n
Download the release package master
for Ubuntu 1804
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb.sha256sum.txt\n
Download the nightly version.
Danger
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu2004.amd64.deb\n
For example, download the Centos 7.5
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm.sha256sum.txt\n
For example, download the Ubuntu 1804
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb.sha256sum.txt\n
Use the following syntax to install with an RPM package.
$ sudo rpm -ivh --prefix=<installation_path> <package_name>\n
The option --prefix
indicates the installation path. The default path is /usr/local/nebula/
.
For example, to install an RPM package in the default path for the master version, run the following command.
sudo rpm -ivh nebula-graph-master.el7.x86_64.rpm\n
Use the following syntax to install with a DEB package.
$ sudo dpkg -i <package_name>\n
Note
Customizing the installation path is not supported when installing NebulaGraph with a DEB package. The default installation path is /usr/local/nebula/
.
For example, to install a DEB package for the master version, run the following command.
sudo dpkg -i nebula-graph-master.ubuntu1804.amd64.deb\n
Note
The default installation path is /usr/local/nebula/
.
NebulaGraph supports managing services with scripts.
"},{"location":"reuse/source_manage-service/#manage_services_with_script","title":"Manage services with script","text":"You can use the nebula.service
script to start, stop, restart, terminate, and check the NebulaGraph services.
Note
nebula.service
is stored in the /usr/local/nebula/scripts
directory by default. If you have customized the path, use the actual path in your environment.
$ sudo /usr/local/nebula/scripts/nebula.service\n[-v] [-c <config_file_path>]\n<start | stop | restart | kill | status>\n<metad | graphd | storaged | all>\n
Parameter Description -v
Display detailed debugging information. -c
Specify the configuration file path. The default path is /usr/local/nebula/etc/
. start
Start the target services. stop
Stop the target services. restart
Restart the target services. kill
Terminate the target services. status
Check the status of the target services. metad
Set the Meta Service as the target service. graphd
Set the Graph Service as the target service. storaged
Set the Storage Service as the target service. all
Set all the NebulaGraph services as the target services."},{"location":"reuse/source_manage-service/#start_nebulagraph","title":"Start NebulaGraph","text":"Run the following command to start NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service start all\n[INFO] Starting nebula-metad...\n[INFO] Done\n[INFO] Starting nebula-graphd...\n[INFO] Done\n[INFO] Starting nebula-storaged...\n[INFO] Done\n
"},{"location":"reuse/source_manage-service/#stop_nebulagraph","title":"Stop NebulaGraph","text":"Danger
Do not run kill -9
to forcibly terminate the processes. Otherwise, there is a low probability of data loss.
Run the following command to stop NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service stop all\n[INFO] Stopping nebula-metad...\n[INFO] Done\n[INFO] Stopping nebula-graphd...\n[INFO] Done\n[INFO] Stopping nebula-storaged...\n[INFO] Done\n
"},{"location":"reuse/source_manage-service/#check_the_service_status","title":"Check the service status","text":"Run the following command to check the service status of NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service status all\n
NebulaGraph is running normally if the following information is returned.
INFO] nebula-metad(33fd35e): Running as 29020, Listening on 9559\n[INFO] nebula-graphd(33fd35e): Running as 29095, Listening on 9669\n[WARN] nebula-storaged after v3.0.0 will not start service until it is added to cluster.\n[WARN] See Manage Storage hosts:ADD HOSTS in https://docs.nebula-graph.io/\n[INFO] nebula-storaged(33fd35e): Running as 29147, Listening on 9779\n
Note
After starting NebulaGraph, the port of the nebula-storaged
process is shown in red. Because the nebula-storaged
process waits for the nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
[INFO] nebula-metad: Running as 25600, Listening on 9559\n[INFO] nebula-graphd: Exited\n[INFO] nebula-storaged: Running as 25646, Listening on 9779\n
The NebulaGraph services consist of the Meta Service, Graph Service, and Storage Service. The configuration files for all three services are stored in the /usr/local/nebula/etc/
directory by default. You can check the configuration files according to the returned result to troubleshoot problems.
Connect to NebulaGraph
"},{"location":"synchronization-and-migration/2.balance-syntax/","title":"BALANCE syntax","text":"We can submit tasks to load balance Storage services in NebulaGraph. For more information about storage load balancing and examples, see Storage load balance.
Note
For other job management commands, see Job manager and the JOB statements.
The syntax for load balance is described as follows.
Syntax DescriptionSUBMIT JOB BALANCE LEADER
Starts a job to balance the distribution of all the storage leaders in all graph spaces. It returns the job ID. For details about how to view, stop, and restart a job, see Job manager and the JOB statements.
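The example is as follows. You can check the returned job ID with SHOW JOBS afterward:
nebula> SUBMIT JOB BALANCE LEADER;
nebula> SHOW JOBS;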
"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to NebulaGraph master Documentation","text":"Note
This manual is revised on 2024-4-26, with GitHub commit 924040ba96.
NebulaGraph is a distributed, scalable, and lightning-fast graph database. It is the optimal solution in the world capable of hosting graphs with dozens of billions of vertices (nodes) and trillions of edges (relationships) with millisecond latency.
"},{"location":"#getting_started","title":"Getting started","text":"Note
Additional information or operation-related notes.
Caution
May have adverse effects, such as causing performance degradation or triggering known minor problems.
Warning
May lead to serious issues, such as data loss or system crash.
Danger
May lead to extremely serious issues, such as system damage or information leakage.
Compatibility
The compatibility notes between nGQL and openCypher, or between the current version of nGQL and its prior ones.
Enterpriseonly
Differences between the NebulaGraph Community and Enterprise editions.
"},{"location":"#modify_errors","title":"Modify errors","text":"This NebulaGraph manual is written in the Markdown language. Users can click the pencil sign on the upper right side of each document title and modify errors.
"},{"location":"nebula-bench/","title":"NebulaGraph Bench","text":"NebulaGraph Bench is a performance test tool for NebulaGraph using the LDBC data set.
"},{"location":"nebula-bench/#scenario","title":"Scenario","text":"Release
"},{"location":"nebula-bench/#test_process","title":"Test process","text":"For detailed usage instructions, see NebulaGraph Bench.
"},{"location":"nebula-console/","title":"NebulaGraph Console","text":"NebulaGraph Console is a native CLI client for NebulaGraph. It can be used to connect a NebulaGraph cluster and execute queries. It also supports special commands to manage parameters, export query results, import test datasets, etc.
"},{"location":"nebula-console/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"nebula-console/#obtain_nebulagraph_console","title":"Obtain NebulaGraph Console","text":"You can obtain NebulaGraph Console in the following ways:
To connect to NebulaGraph with the nebula-console
file, use the following syntax:
<path_of_console> -addr <ip> -port <port> -u <username> -p <password>\n
path_of_console
indicates the storage path of the NebulaGraph Console binary file.For example:
Direct link to NebulaGraph
./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula\n
Enable SSL encryption and require two-way authentication
./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula -enable_ssl -ssl_root_ca_path /home/xxx/cert/root.crt -ssl_cert_path /home/xxx/cert/client.crt -ssl_private_key_path /home/xxx/cert/client.key\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP or hostname of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is millisecond. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. -ssl_insecure_skip_verify
Specifies whether the client skips verifying the server's certificate chain and hostname. The default is false
. If set to true
, any certificate chain and hostname provided by the server is accepted. For information on more parameters, see the project repository.
"},{"location":"nebula-console/#manage_parameters","title":"Manage parameters","text":"You can save parameters for parameterized queries.
Note
SAMPLE
clauses.The command to save a parameter is as follows:
nebula> :param <param_name> => <param_value>;\n
The example is as follows:
nebula> :param p1 => \"Tim Duncan\";\nnebula> MATCH (v:player{name:$p1})-[:follow]->(n) RETURN v,n;\n+----------------------------------------------------+-------------------------------------------------------+\n| v | n |\n+----------------------------------------------------+-------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n+----------------------------------------------------+-------------------------------------------------------+\nnebula> :param p2 => {\"a\":3,\"b\":false,\"c\":\"Tim Duncan\"};\nnebula> RETURN $p2.b AS b;\n+-------+\n| b |\n+-------+\n| false |\n+-------+\n
The command to view the saved parameters is as follows:
nebula> :params;\n
The command to view the specified parameters is as follows:
nebula> :params <param_name>;\n
The command to delete a specified parameter is as follows:
nebula> :param <param_name> =>;\n
Export query results, which can be saved as a CSV file, DOT file, and a format of Profile or Explain.
Note
pwd
shows.The command to export a csv file is as follows:
nebula> :CSV <file_name.csv>;\n
The command to export a DOT file is as follows:
nebula> :dot <file_name.dot>\n
The example is as follows:
nebula> :dot a.dot\nnebula> PROFILE FORMAT=\"dot\" GO FROM \"player100\" OVER follow;\n
The command to export a PROFILE or EXPLAIN format is as follows:
nebula> :profile <file_name>;\n
or nebula> :explain <file_name>;\n
Note
The text file output by the above command is the preferred way to report issues in GitHub and execution plans in forums, and for graph query tuning because it has more information and is more readable than a screenshot or CSV file in Studio.
The example is as follows:
nebula> :profile profile.log\nnebula> PROFILE GO FROM \"player102\" OVER serve YIELD dst(edge);\nnebula> :profile profile.dot\nnebula> PROFILE FORMAT=\"dot\" GO FROM \"player102\" OVER serve YIELD dst(edge);\nnebula> :explain explain.log\nnebula> EXPLAIN GO FROM \"player102\" OVER serve YIELD dst(edge);\n
The testing dataset is named basketballplayer
. To view details about the schema and data, use the corresponding SHOW
command.
The command to import a testing dataset is as follows:
nebula> :play basketballplayer\n
"},{"location":"nebula-console/#run_a_command_multiple_times","title":"Run a command multiple times","text":"To run a command multiple times, use the following command:
nebula> :repeat N\n
The example is as follows:
nebula> :repeat 3\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\nGot 2 rows (time spent 2602/3214 us)\n\nFri, 20 Aug 2021 06:36:05 UTC\n\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\nGot 2 rows (time spent 583/849 us)\n\nFri, 20 Aug 2021 06:36:05 UTC\n\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\nGot 2 rows (time spent 496/671 us)\n\nFri, 20 Aug 2021 06:36:05 UTC\n\nExecuted 3 times, (total time spent 3681/4734 us), (average time spent 1227/1578 us)\n
"},{"location":"nebula-console/#sleep","title":"Sleep","text":"This command will make NebulaGraph Console sleep for N seconds. The schema is altered in an async way and takes effect in the next heartbeat cycle. Therefore, this command is usually used when altering schema. The command is as follows:
nebula> :sleep N\n
"},{"location":"nebula-console/#disconnect_nebulagraph_console_from_nebulagraph","title":"Disconnect NebulaGraph Console from NebulaGraph","text":"You can use :EXIT
or :QUIT
to disconnect from NebulaGraph. For convenience, NebulaGraph Console supports using these commands in lower case without the colon (\":\"), such as quit
.
The example is as follows:
nebula> :QUIT\n\nBye root!\n
"},{"location":"1.introduction/1.what-is-nebula-graph/","title":"What is NebulaGraph","text":"NebulaGraph is an open-source, distributed, easily scalable, and native graph database. It is capable of hosting graphs with hundreds of billions of vertices and trillions of edges, and serving queries with millisecond-latency.
"},{"location":"1.introduction/1.what-is-nebula-graph/#what_is_a_graph_database","title":"What is a graph database","text":"A graph database, such as NebulaGraph, is a database that specializes in storing vast graph networks and retrieving information from them. It efficiently stores data as vertices (nodes) and edges (relationships) in labeled property graphs. Properties can be attached to both vertices and edges. Each vertex can have one or multiple tags (labels).
Graph databases are well suited for storing most kinds of data models abstracted from reality. Things are connected in almost all fields in the world. Modeling systems like relational databases extract the relationships between entities and squeeze them into table columns alone, with their types and properties stored in other columns or even other tables. This makes data management time-consuming and cost-ineffective.
NebulaGraph, as a typical native graph database, allows you to store the rich relationships as edges with edge types and properties directly attached to them.
"},{"location":"1.introduction/1.what-is-nebula-graph/#advantages_of_nebulagraph","title":"Advantages of NebulaGraph","text":""},{"location":"1.introduction/1.what-is-nebula-graph/#open_source","title":"Open source","text":"NebulaGraph is open under the Apache 2.0 License. More and more people such as database developers, data scientists, security experts, and algorithm engineers are participating in the designing and development of NebulaGraph. To join the opening of source code and ideas, surf the NebulaGraph GitHub page.
"},{"location":"1.introduction/1.what-is-nebula-graph/#outstanding_performance","title":"Outstanding performance","text":"Written in C++ and born for graphs, NebulaGraph handles graph queries in milliseconds. Among most databases, NebulaGraph shows superior performance in providing graph data services. The larger the data size, the greater the superiority of NebulaGraph.For more information, see NebulaGraph benchmarking.
"},{"location":"1.introduction/1.what-is-nebula-graph/#high_scalability","title":"High scalability","text":"NebulaGraph is designed in a shared-nothing architecture and supports scaling in and out without interrupting the database service.
"},{"location":"1.introduction/1.what-is-nebula-graph/#developer_friendly","title":"Developer friendly","text":"NebulaGraph supports clients in popular programming languages like Java, Python, C++, and Go, and more are under development. For more information, see NebulaGraph clients.
"},{"location":"1.introduction/1.what-is-nebula-graph/#reliable_access_control","title":"Reliable access control","text":"NebulaGraph supports strict role-based access control and external authentication servers such as LDAP (Lightweight Directory Access Protocol) servers to enhance data security. For more information, see Authentication and authorization.
"},{"location":"1.introduction/1.what-is-nebula-graph/#diversified_ecosystem","title":"Diversified ecosystem","text":"More and more native tools of NebulaGraph have been released, such as NebulaGraph Studio, NebulaGraph Console, and NebulaGraph Exchange. For more ecosystem tools, see Ecosystem tools overview.
Besides, NebulaGraph can be integrated with many cutting-edge technologies, such as Spark, Flink, and HBase, for the purpose of mutual strengthening in a world of increasing challenges and opportunities.
"},{"location":"1.introduction/1.what-is-nebula-graph/#opencypher-compatible_query_language","title":"OpenCypher-compatible query language","text":"The native NebulaGraph Query Language, also known as nGQL, is a declarative, openCypher-compatible textual query language. It is easy to understand and easy to use. For more information, see nGQL guide.
"},{"location":"1.introduction/1.what-is-nebula-graph/#future-oriented_hardware_with_balanced_reading_and_writing","title":"Future-oriented hardware with balanced reading and writing","text":"Solid-state drives have extremely high performance and they are getting cheaper. NebulaGraph is a product based on SSD. Compared with products based on HDD and large memory, it is more suitable for future hardware trends and easier to achieve balanced reading and writing.
"},{"location":"1.introduction/1.what-is-nebula-graph/#easy_data_modeling_and_high_flexibility","title":"Easy data modeling and high flexibility","text":"You can easily model the connected data into NebulaGraph for your business without forcing them into a structure such as a relational table, and properties can be added, updated, and deleted freely. For more information, see Data modeling.
"},{"location":"1.introduction/1.what-is-nebula-graph/#high_popularity","title":"High popularity","text":"NebulaGraph is being used by tech leaders such as Tencent, Vivo, Meituan, and JD Digits. For more information, visit the NebulaGraph official website.
"},{"location":"1.introduction/1.what-is-nebula-graph/#use_cases","title":"Use cases","text":"NebulaGraph can be used to support various graph-based scenarios. To spare the time spent on pushing the kinds of data mentioned in this section into relational databases and on bothering with join queries, use NebulaGraph.
"},{"location":"1.introduction/1.what-is-nebula-graph/#fraud_detection","title":"Fraud detection","text":"Financial institutions have to traverse countless transactions to piece together potential crimes and understand how combinations of transactions and devices might be related to a single fraud scheme. This kind of scenario can be modeled in graphs, and with the help of NebulaGraph, fraud rings and other sophisticated scams can be easily detected.
"},{"location":"1.introduction/1.what-is-nebula-graph/#real-time_recommendation","title":"Real-time recommendation","text":"NebulaGraph offers the ability to instantly process the real-time information produced by a visitor and make accurate recommendations on articles, videos, products, and services.
"},{"location":"1.introduction/1.what-is-nebula-graph/#intelligent_question-answer_system","title":"Intelligent question-answer system","text":"Natural languages can be transformed into knowledge graphs and stored in NebulaGraph. A question organized in a natural language can be resolved by a semantic parser in an intelligent question-answer system and re-organized. Then, possible answers to the question can be retrieved from the knowledge graph and provided to the one who asked the question.
"},{"location":"1.introduction/1.what-is-nebula-graph/#social_networking","title":"Social networking","text":"Information on people and their relationships is typical graph data. NebulaGraph can easily handle the social networking information of billions of people and trillions of relationships, and provide lightning-fast queries for friend recommendations and job promotions in the case of massive concurrency.
"},{"location":"1.introduction/1.what-is-nebula-graph/#related_links","title":"Related links","text":"In graph theory, a path in a graph is a finite or infinite sequence of edges which joins a sequence of vertices. Paths are fundamental concepts of graph theory.
Paths can be categorized into 3 types: walk
, trail
, and path
. For more information, see Wikipedia.
The following figure is an example for a brief introduction.
"},{"location":"1.introduction/2.1.path/#walk","title":"Walk","text":"A walk
is a finite or infinite sequence of edges. Both vertices and edges can be repeatedly visited in graph traversal.
In the above figure, C, D, and E form a cycle. So, this figure contains an infinite number of walks, such as A->B->C->D->E
, A->B->C->D->E->C
, and A->B->C->D->E->C->D
.
Note
GO
statements use walk
.
A trail
is a finite sequence of edges. Only vertices can be repeatedly visited in graph traversal. The Seven Bridges of K\u00f6nigsberg is a typical trail
.
In the above figure, edges cannot be repeatedly visited. So, this figure contains a finite number of trails. The longest trail in this figure consists of 5 edges: A->B->C->D->E->C
.
Note
MATCH
, FIND PATH
, and GET SUBGRAPH
statements use trail
.
There are two special cases of trail, cycle
and circuit
. The following figure is an example for a brief introduction.
cycle
A cycle
refers to a closed trail
. Only the terminal vertices can be repeatedly visited. The longest cycle in this figure consists of 3 edges: A->B->C->A
or C->D->E->C
.
circuit
A circuit
refers to a closed trail
. Edges cannot be repeatedly visited in graph traversal. Apart from the terminal vertices, other vertices can also be repeatedly visited. The longest circuit in this figure consists of 6 edges: A->B->C->D->E->C->A
.
A path
is a finite sequence of edges. Neither vertices nor edges can be repeatedly visited in graph traversal.
So, the above figure contains a finite number of paths. The longest path in this figure consists of 4 edges: A->B->C->D->E
.
A data model is a model that organizes data and specifies how they are related to one another. This topic describes the NebulaGraph data model and provides suggestions for data modeling with NebulaGraph.
"},{"location":"1.introduction/2.data-model/#data_structures","title":"Data structures","text":"NebulaGraph data model uses six data structures to store data. They are graph spaces, vertices, edges, tags, edge types and properties.
In NebulaGraph, vertices are identified with vertex identifiers (i.e. VID
). The VID
must be unique in the same graph space. The data type of a VID must be int64 or fixed_string(N).
Compatibility
In NebulaGraph 2.x, a vertex must have at least one tag. In NebulaGraph master, a tag is not required for a vertex.
->
identifies the directions of edges. Edges can be traversed in either direction. An edge is uniquely identified by <a source vertex, an edge type, a rank value, and a destination vertex>
. Edges have no EID. Note
Tags and Edge types are similar to \"vertex tables\" and \"edge tables\" in the relational databases.
"},{"location":"1.introduction/2.data-model/#directed_property_graph","title":"Directed property graph","text":"NebulaGraph stores data in directed property graphs. A directed property graph has a set of vertices connected by directed edges. Both vertices and edges can have properties. A directed property graph is represented as:
G = < V, E, PV, PE >
The following table is an example of the structure of the basketball player dataset. We have two types of vertices, that is player and team, and two types of edges, that is serve and follow.
| Element | Name | Property name (Data type) | Description |
| --- | --- | --- | --- |
| Tag | player | name (string), age (int) | Represents players in the team. The properties name and age indicate the name and age. |
| Tag | team | name (string) | Represents the teams. The property name indicates the team name. |
| Edge type | serve | start_year (int), end_year (int) | Represents the action of a player serving a team. The action links the player to the team, and the direction is from the player to the team. The properties start_year and end_year indicate the start year and end year of the service respectively. |
| Edge type | follow | degree (int) | Represents the action of a player following another player on Twitter. The action links one player to the other player, and the direction is from one player to the other player. The property degree indicates the rating on how well the follower liked the followee. |

Note
NebulaGraph supports only directed edges.
Compatibility
NebulaGraph master allows dangling edges. Therefore, when adding or deleting edges, you are responsible for ensuring that the corresponding source vertex and destination vertex of an edge exist. For details, see INSERT VERTEX, DELETE VERTEX, INSERT EDGE, and DELETE EDGE.
The MERGE statement in openCypher is not supported.
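As a sketch, the schema in the preceding table could be created with statements like the following (names and types are taken from the table; the official dataset script may differ):
nebula> CREATE TAG player(name string, age int);\nnebula> CREATE TAG team(name string);\nnebula> CREATE EDGE serve(start_year int, end_year int);\nnebula> CREATE EDGE follow(degree int);\n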
"},{"location":"1.introduction/3.vid/","title":"VID","text":"In a graph space, a vertex is uniquely identified by its ID, which is called a VID or a Vertex ID.
"},{"location":"1.introduction/3.vid/#features","title":"Features","text":"FIXED_STRING(<N>)
or INT64
. One graph space can only select one VID type.Vertices with the same VID will be identified as the same one. For example:
INSERT
statements (neither uses a parameter of IF NOT EXISTS
) with the same VID and tag are operated at the same time, the latter INSERT
will overwrite the former.INSERT
statements with the same VID but different tags, like TAG A
and TAG B
, are operated at the same time, the operation of Tag A
will not affect Tag B
.INT64
while NebulaGraph 2.x supports INT64
and FIXED_STRING(<N>)
. In CREATE SPACE
, VID types can be set via vid_type
.id()
function can be used to specify or locate a VID.LOOKUP
or MATCH
statements can be used to find a VID via property index.DELETE xxx WHERE id(xxx) == \"player100\"
or GO FROM \"player100\"
. Finding VIDs via properties and then operating the graph will cause poor performance, such as LOOKUP | GO FROM $-.ids
, which will run both LOOKUP
and |
one more time.VIDs can be generated via applications. Here are some tips:
N
of FIXED_STRING(<N>)
too much. Otherwise, it will occupy a lot of memory and hard disks, and slow down performance. Generate VIDs via BASE64, MD5, hash by encoding and splicing.The data type of a VID must be defined when you create the graph space. Once defined, it cannot be modified.
A VID is set when you insert a vertex and cannot be modified.
"},{"location":"1.introduction/3.vid/#query_start_vid_and_global_scan","title":"Querystart vid
and global scan","text":"In most cases, the execution plan of query statements in NebulaGraph (MATCH
, GO
, and LOOKUP
) must query the start vid
in a certain way.
There are only two ways to locate start vid
:
For example, GO FROM \"player100\" OVER
explicitly indicates in the statement that start vid
is \"player100\".
For example, LOOKUP ON player WHERE player.name == \"Tony Parker\"
or MATCH (v:player {name:\"Tony Parker\"})
locates start vid
by the index of the property player.name
.
NebulaGraph consists of three services: the Graph Service, the Storage Service, and the Meta Service. It applies the separation of storage and computing architecture.
Each service has its executable binaries and processes launched from the binaries. Users can deploy a NebulaGraph cluster on a single machine or multiple machines using these binaries.
The following figure shows the architecture of a typical NebulaGraph cluster.
"},{"location":"1.introduction/3.nebula-graph-architecture/1.architecture-overview/#the_meta_service","title":"The Meta Service","text":"The Meta Service in the NebulaGraph architecture is run by the nebula-metad processes. It is responsible for metadata management, such as schema operations, cluster administration, and user privilege management.
For details on the Meta Service, see Meta Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/1.architecture-overview/#the_graph_service_and_the_storage_service","title":"The Graph Service and the Storage Service","text":"NebulaGraph applies the separation of storage and computing architecture. The Graph Service is responsible for querying. The Storage Service is responsible for storage. They are run by different processes, i.e., nebula-graphd and nebula-storaged. The benefits of the separation of storage and computing architecture are as follows:
The separated structure makes both the Graph Service and the Storage Service flexible and easy to scale in or out.
If part of the Graph Service fails, the data stored by the Storage Service suffers no loss. And if the rest part of the Graph Service is still able to serve the clients, service recovery can be performed quickly, even unfelt by the users.
The separation of storage and computing architecture provides a higher resource utilization rate, and it enables clients to manage the cost flexibly according to business demands.
With the ability to run separately, the Graph Service may work with multiple types of storage engines, and the Storage Service may also serve more types of computing engines.
For details on the Graph Service and the Storage Service, see Graph Service and Storage Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/","title":"Meta Service","text":"This topic introduces the architecture and functions of the Meta Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#the_architecture_of_the_meta_service","title":"The architecture of the Meta Service","text":"The architecture of the Meta Service is as follows:
The Meta Service is run by nebula-metad processes. Users can deploy nebula-metad processes according to the scenario:
All the nebula-metad processes form a Raft-based cluster, with one process as the leader and the others as the followers.
The leader is elected by the majorities and only the leader can provide service to the clients or other components of NebulaGraph. The followers will be run in a standby way and each has a data replication of the leader. Once the leader fails, one of the followers will be elected as the new leader.
Note
The data of the leader and the followers will keep consistent through Raft. Thus the breakdown and election of the leader will not cause data inconsistency. For more information on Raft, see Storage service architecture.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#functions_of_the_meta_service","title":"Functions of the Meta Service","text":""},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_user_accounts","title":"Manages user accounts","text":"The Meta Service stores the information of user accounts and the privileges granted to the accounts. When the clients send queries to the Meta Service through an account, the Meta Service checks the account information and whether the account has the right privileges to execute the queries or not.
For more information on NebulaGraph access control, see Authentication.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_partitions","title":"Manages partitions","text":"The Meta Service stores and manages the locations of the storage partitions and helps balance the partitions.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_graph_spaces","title":"Manages graph spaces","text":"NebulaGraph supports multiple graph spaces. Data stored in different graph spaces are securely isolated. The Meta Service stores the metadata of all graph spaces and tracks the changes of them, such as adding or dropping a graph space.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_schema_information","title":"Manages schema information","text":"NebulaGraph is a strong-typed graph database. Its schema contains tags (i.e., the vertex types), edge types, tag properties, and edge type properties.
The Meta Service stores the schema information. Besides, it performs the addition, modification, and deletion of the schema, and logs the versions of them.
For more information on NebulaGraph schema, see Data model.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_ttl_information","title":"Manages TTL information","text":"The Meta Service stores the definition of TTL (Time to Live) options which are used to control data expiration. The Storage Service takes care of the expiring and evicting processes. For more information, see TTL.
"},{"location":"1.introduction/3.nebula-graph-architecture/2.meta-service/#manages_jobs","title":"Manages jobs","text":"The Job Management module in the Meta Service is responsible for the creation, queuing, querying, and deletion of jobs.
"},{"location":"1.introduction/3.nebula-graph-architecture/3.graph-service/","title":"Graph Service","text":"The Graph Service is used to process the query. It has four submodules: Parser, Validator, Planner, and Executor. This topic will describe the Graph Service accordingly.
"},{"location":"1.introduction/3.nebula-graph-architecture/3.graph-service/#the_architecture_of_the_graph_service","title":"The architecture of the Graph Service","text":"After a query is sent to the Graph Service, it will be processed by the following four submodules:
Parser: Performs lexical analysis and syntax analysis.
Validator: Validates the statements.
Planner: Generates and optimizes the execution plans.
Executor: Executes the plans with operators.
After receiving a request, the statements will be parsed by Parser composed of Flex (lexical analysis tool) and Bison (syntax analysis tool), and its corresponding AST will be generated. Statements will be directly intercepted in this stage because of their invalid syntax.
For example, the structure of the AST of GO FROM \"Tim\" OVER like WHERE properties(edge).likeness > 8.0 YIELD dst(edge)
is shown in the following figure.
Validator performs a series of validations on the AST. It mainly works on these tasks:
Validator will validate whether the metadata is correct or not.
When parsing the OVER
, WHERE
, and YIELD
clauses, Validator looks up the Schema and verifies whether the edge type and tag data exist or not. For an INSERT
statement, Validator verifies whether the types of the inserted data are the same as the ones defined in the Schema.
Validator will verify whether the cited variable exists or not, or whether the cited property is variable or not.
For composite statements, like $var = GO FROM \"Tim\" OVER like YIELD dst(edge) AS ID; GO FROM $var.ID OVER serve YIELD dst(edge)
, Validator verifies first to see if var
is defined, and then to check if the ID
property is attached to the var
variable.
Validator infers what type the result of an expression is and verifies the type against the specified clause.
For example, the WHERE
clause requires the result to be a bool
value, a NULL
value, or empty
.
*
Validator needs to verify all the Schema that involves *
when verifying the clause if there is a *
in the statement.
Take a statement like GO FROM \"Tim\" OVER * YIELD dst(edge), properties(edge).likeness, dst(edge)
as an example. When verifying the OVER
clause, Validator needs to verify all the edge types. If the edge type includes like
and serve
, the statement would be GO FROM \"Tim\" OVER like,serve YIELD dst(edge), properties(edge).likeness, dst(edge)
.
Validator will check the consistency of the clauses before and after the |
.
In the statement GO FROM \"Tim\" OVER like YIELD dst(edge) AS ID | GO FROM $-.ID OVER serve YIELD dst(edge)
, Validator will verify whether $-.ID
is defined in the clause before the |
.
When the validation succeeds, an execution plan will be generated. Its data structure will be stored in the src/planner
directory.
In the nebula-graphd.conf
file, when enable_optimizer
is set to be false
, Planner will not optimize the execution plans generated by Validator. It will be executed by Executor directly.
In the nebula-graphd.conf
file, when enable_optimizer
is set to be true
, Planner will optimize the execution plans generated by Validator. The structure is as follows.
In the execution plan on the right side of the preceding figure, each node directly depends on other nodes. For example, the root node Project
depends on the Filter
node, the Filter
node depends on the GetNeighbor
node, and so on, up to the leaf node Start
. Then the execution plan is (not truly) executed.
During this stage, every node has its input and output variables, which are stored in a hash table. The execution plan is not truly executed, so the value of each key in the associated hash table is empty (except for the Start
node, where the input variables hold the starting data), and the hash table is defined in src/context/ExecutionContext.cpp
under the nebula-graph
repository.
For example, if the hash table is named as ResultMap
when creating the Filter
node, users can determine that the node takes data from ResultMap[\"GN1\"]
, then puts the result into ResultMap[\"Filter2\"]
, and so on. All these work as the input and output of each node.
The optimization rules that Planner has implemented so far are considered RBO (Rule-Based Optimization), namely the pre-defined optimization rules. The CBO (Cost-Based Optimization) feature is under development. The optimized code is in the src/optimizer/
directory under the nebula-graph
repository.
RBO is a \u201cbottom-up\u201d exploration process. For each rule, the root node of the execution plan (in this case, the Project
node) is the entry point, and step by step along with the node dependencies, it reaches the node at the bottom to see if it matches the rule.
As shown in the preceding figure, when the Filter
node is explored, it is found that its children node is GetNeighbors
, which matches successfully with the pre-defined rules, so a transformation is initiated to integrate the Filter
node into the GetNeighbors
node, the Filter
node is removed, and then the process continues to the next rule. Therefore, when the GetNeighbor
operator calls interfaces of the Storage layer to get the neighboring edges of a vertex during the execution stage, the Storage layer will directly filter out the unqualified edges internally. Such optimization greatly reduces the amount of data transfer, which is commonly known as filter pushdown.
The Executor module consists of Scheduler and Executor. The Scheduler generates the corresponding execution operators against the execution plan, starting from the leaf nodes and ending at the root node. The structure is as follows.
Each node of the execution plan has one execution operator node, whose input and output have been determined in the execution plan. Each operator only needs to get the values for the input variables, compute them, and finally put the results into the corresponding output variables. Therefore, it is only necessary to execute step by step from Start
, and the result of the last operator is returned to the user as the final result.
The source code hierarchy under the nebula-graph repository is as follows.
|--src\n |--graph\n |--context //contexts for validation and execution\n |--executor //execution operators\n |--gc //garbage collector\n |--optimizer //optimization rules\n |--planner //structure of the execution plans\n |--scheduler //scheduler\n |--service //external service management\n |--session //session management\n |--stats //monitoring metrics\n |--util //basic components\n |--validator //validation of the statements\n |--visitor //visitor expression\n
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/","title":"Storage Service","text":"The persistent data of NebulaGraph have two parts. One is the Meta Service that stores the meta-related data.
The other is the Storage Service that stores the data, which is run by the nebula-storaged process. This topic will describe the architecture of the Storage Service.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#advantages","title":"Advantages","text":"The Storage Service is run by the nebula-storaged process. Users can deploy nebula-storaged processes on different occasions. For example, users can deploy 1 nebula-storaged process in a test environment and deploy 3 nebula-storaged processes in a production environment.
All the nebula-storaged processes consist of a Raft-based cluster. There are three layers in the Storage Service:
Storage interface
The top layer is the storage interface. It defines a set of APIs that are related to the graph concepts. These API requests will be translated into a set of KV operations targeting the corresponding Partition. For example:
getNeighbors
: queries the in-edge or out-edge of a set of vertices, returns the edges and the corresponding properties, and supports conditional filtering.insert vertex/edge
: inserts a vertex or edge and its properties.getProps
: gets the properties of a vertex or an edge.It is this layer that makes the Storage Service a real graph storage. Otherwise, it is just a KV storage.
Consensus
Below the storage interface is the consensus layer that implements Multi Group Raft, which ensures the strong consistency and high availability of the Storage Service.
Store engine
The bottom layer is the local storage engine library, providing operations like get
, put
, and scan
on local disks. The related interfaces are stored in KVStore.h
and KVEngine.h
files. You can develop your own local store plugins based on your needs.
The following will describe some features of the Storage Service based on the above architecture.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#storage_writing_process","title":"Storage writing process","text":""},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#kvstore","title":"KVStore","text":"NebulaGraph develops and customizes its built-in KVStore for the following reasons.
Therefore, NebulaGraph develops its own KVStore with RocksDB as the local storage engine. The advantages are as follows.
The Meta Service manages all the Storage servers. All the partition distribution data and current machine status can be found in the meta service. Accordingly, users can execute a manual load balancing plan in meta service.
Note
NebulaGraph does not support auto load balancing because auto data transfer will affect online business.
Graphs consist of vertices and edges. NebulaGraph uses key-value pairs to store vertices, edges, and their properties. Vertices and edges are stored in keys and their properties are stored in values. Such structure enables efficient property filtering.
The storage structure of vertices
Different from NebulaGraph version 2.x, version 3.x added a new key for each vertex. Compared to the old key that still exists, the new key has no TagID
field and no value. Vertices in NebulaGraph can now live without tags owing to the new key.
Type
One byte, used to indicate the key type. PartID
Three bytes, used to indicate the sharding partition and to scan the partition data based on the prefix when re-balancing the partition. VertexID
The vertex ID. For an integer VertexID, it occupies eight bytes. However, for a string VertexID, it is changed to fixed_string
of a fixed length which needs to be specified by users when they create the space. TagID
Four bytes, used to indicate the tags that vertex relate with. SerializedValue
The serialized value of the key. It stores the property information of the vertex. Type
One byte, used to indicate the key type. PartID
Three bytes, used to indicate the partition ID. This field can be used to scan the partition data based on the prefix when re-balancing the partition. VertexID
Used to indicate vertex ID. The former VID refers to the source VID in the outgoing edge and the dest VID in the incoming edge, while the latter VID refers to the dest VID in the outgoing edge and the source VID in the incoming edge. Edge Type
Four bytes, used to indicate the edge type. Greater than zero indicates out-edge, less than zero means in-edge. Rank
Eight bytes, used to indicate multiple edges in one edge type. Users can set the field based on needs and store weight, such as transaction time and transaction number. PlaceHolder
One byte. Reserved. SerializedValue
The serialized value of the key. It stores the property information of the edge. NebulaGraph uses strong-typed Schema.
NebulaGraph will store the properties of vertex and edges in order after encoding them. Since the length of fixed-length properties is fixed, queries can be made in no time according to offset. Before decoding, NebulaGraph needs to get (and cache) the schema information in the Meta Service. In addition, when encoding properties, NebulaGraph will add the corresponding schema version to support online schema change.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#data_partitioning","title":"Data partitioning","text":"Since in an ultra-large-scale relational network, vertices can be as many as tens to hundreds of billions, and edges are even more than trillions. Even if only vertices and edges are stored, the storage capacity of both exceeds that of ordinary servers. Therefore, NebulaGraph uses hash to shard the graph elements and store them in different partitions.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#edge_partitioning_and_storage_amplification","title":"Edge partitioning and storage amplification","text":"In NebulaGraph, an edge corresponds to two key-value pairs on the hard disk. When there are lots of edges and each has many properties, storage amplification will be obvious. The storage format of edges is shown in the figure below.
In this example, SrcVertex connects DstVertex via EdgeA, forming the path of (SrcVertex)-[EdgeA]->(DstVertex)
. SrcVertex, DstVertex, and EdgeA will all be stored in Partition x and Partition y as four key-value pairs in the storage layer. Details are as follows:
EdgeA_Out and EdgeA_In are stored in storage layer with opposite directions, constituting EdgeA logically. EdgeA_Out is used for traversal requests starting from SrcVertex, such as (a)-[]->()
; EdgeA_In is used for traversal requests starting from DstVertex, such as ()-[]->(a)
.
Like EdgeA_Out and EdgeA_In, NebulaGraph redundantly stores the information of each edge, which doubles the actual capacities needed for edge storage. The key corresponding to the edge occupies a small hard disk space, but the space occupied by Value is proportional to the length and amount of the property value. Therefore, it will occupy a relatively large hard disk space if the property value of the edge is large or there are many edge property values.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#partition_algorithm","title":"Partition algorithm","text":"NebulaGraph uses a static Hash strategy to shard data through a modulo operation on vertex ID. All the out-keys, in-keys, and tag data will be placed in the same partition. In this way, query efficiency is increased dramatically.
Note
The number of partitions needs to be determined when users are creating a graph space since it cannot be changed afterward. Users are supposed to take into consideration the demands of future business when setting it.
When inserting into NebulaGraph, vertices and edges are distributed across different partitions. And the partitions are located on different machines. The number of partitions is set in the CREATE SPACE statement and cannot be changed afterward.
If certain vertices need to be placed on the same partition (i.e., on the same machine), see Formula/code.
The following code will briefly describe the relationship between VID and partition.
// If VertexID occupies 8 bytes, it will be stored in int64 to be compatible with the version 1.0.\nuint64_t vid = 0;\nif (id.size() == 8) {\n memcpy(static_cast<void*>(&vid), id.data(), 8);\n} else {\n MurmurHash2 hash;\n vid = hash(id.data());\n}\nPartitionID pId = vid % numParts + 1;\n
Roughly speaking, after hashing a fixed string to int64, (the hashing of int64 is the number itself), do modulo, and then plus one, namely:
pId = vid % numParts + 1;\n
Parameters and descriptions of the preceding formula are as follows:
Parameter Description%
The modulo operation. numParts
The number of partitions for the graph space where the VID
is located, namely the value of partition_num
in the CREATE SPACE statement. pId
The ID for the partition where the VID
is located. Suppose there are 100 partitions, the vertices with VID
1, 101, and 1001 will be stored on the same partition. But, the mapping between the partition ID and the machine address is random. Therefore, we cannot assume that any two partitions are located on the same machine.
In a distributed system, one data usually has multiple replicas so that the system can still run normally even if a few copies fail. It requires certain technical means to ensure consistency between replicas.
Basic principle: Raft is designed to ensure consistency between replicas. Raft uses election between replicas, and the (candidate) replica that wins more than half of the votes will become the Leader, providing external services on behalf of all replicas. The rest Followers will play backups. When the Leader fails (due to communication failure, operation and maintenance commands, etc.), the rest Followers will conduct a new round of elections and vote for a new Leader. The Leader and Followers will detect each other's survival through heartbeats and write them to the hard disk in Raft-wal mode. Replicas that do not respond to more than multiple heartbeats will be considered faulty.
Note
Raft-wal needs to be written into the hard disk periodically. If hard disk bottlenecks to write, Raft will fail to send a heartbeat and conduct a new round of elections. If the hard disk IO is severely blocked, there will be no Leader for a long time.
Read and write: For every writing request of the clients, the Leader will initiate a Raft-wal and synchronize it with the Followers. Only after over half replicas have received the Raft-wal will it return to the clients successfully. For every reading request of the clients, it will get to the Leader directly, while Followers will not be involved.
Failure: Scenario 1: Take a (space) cluster of a single replica as an example. If the system has only one replica, the Leader will be itself. If failure happens, the system will be completely unavailable. Scenario 2: Take a (space) cluster of three replicas as an example. If the system has three replicas, one of them will be the Leader and the rest will be the Followers. If the Leader fails, the rest two can still vote for a new Leader (and a Follower), and the system is still available. But if any of the two Followers fails again, the system will be completely unavailable due to inadequate voters.
Note
Raft and HDFS have different modes of duplication. Raft is based on a quorum vote, so the number of replicas cannot be even.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#multi_group_raft","title":"Multi Group Raft","text":"The Storage Service supports a distributed cluster architecture, so NebulaGraph implements Multi Group Raft according to Raft protocol. Each Raft group stores all the replicas of each partition. One replica is the leader, while others are followers. In this way, NebulaGraph achieves strong consistency and high availability. The functions of Raft are as follows.
NebulaGraph uses Multi Group Raft to improve performance when there are many partitions because Raft-wal cannot be NULL. When there are too many partitions, costs will increase, such as storing information in Raft group, WAL files, or batch operation in low load.
There are two key points to implement the Multi Raft Group:
To share transport layer
Each Raft Group sends messages to its corresponding peers. So if the transport layer cannot be shared, the connection costs will be very high.
To share thread pool
Raft Groups share the same thread pool to prevent starting too many threads and a high context switch cost.
For each partition, it is necessary to do a batch to improve throughput when writing the WAL serially. As NebulaGraph uses WAL to implement some special functions, batches need to be grouped, which is a feature of NebulaGraph.
For example, lock-free CAS operations will execute after all the previous WALs are committed. So for a batch, if there are several WALs in CAS type, we need to divide this batch into several smaller groups and make sure they are committed serially.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#transfer_leadership","title":"Transfer Leadership","text":"Transfer leadership is extremely important for balance. When moving a partition from one machine to another, NebulaGraph first checks if the source is a leader. If so, it should be moved to another peer. After data migration is completed, it is important to balance leader distribution again.
When a transfer leadership command is committed, the leader will abandon its leadership and the followers will start a leader election.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#peer_changes","title":"Peer changes","text":"To avoid split-brain, when members in a Raft Group change, an intermediate state is required. In such a state, the quorum of the old group and new group always have an overlap. Thus it prevents the old or new group from making decisions unilaterally. To make it even simpler, in his doctoral thesis Diego Ongaro suggests adding or removing a peer once to ensure the overlap between the quorum of the new group and the old group. NebulaGraph also uses this approach, except that the way to add or remove a member is different. For details, please refer to addPeer/removePeer in the Raft Part class.
"},{"location":"1.introduction/3.nebula-graph-architecture/4.storage-service/#differences_with_hdfs","title":"Differences with HDFS","text":"The Storage Service is a Raft-based distributed architecture, which has certain differences with that of HDFS. For example:
In a word, the Storage Service is more lightweight with some functions simplified and its architecture is simpler than HDFS, which can effectively improve the read and write performance of a smaller block of data.
"},{"location":"14.client/1.nebula-client/","title":"Clients overview","text":"NebulaGraph supports multiple types of clients for users to connect to and manage the NebulaGraph database.
Note
Only the following classes are thread-safe:
NebulaGraph CPP is a C++ client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/3.nebula-cpp-client/#prerequisites","title":"Prerequisites","text":"You have installed C++ and GCC 4.8 or later versions.
"},{"location":"14.client/3.nebula-cpp-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/3.nebula-cpp-client/#install_nebulagraph_cpp","title":"Install NebulaGraph CPP","text":"This document describes how to install NebulaGraph CPP with the source code.
"},{"location":"14.client/3.nebula-cpp-client/#prerequisites_1","title":"Prerequisites","text":"Clone the NebulaGraph CPP source code to the host.
(Recommended) To install a specific version of NebulaGraph CPP, use the Git option --branch
to specify the branch. For example, to install v3.4.0, run the following command:
$ git clone --branch release-3.4 https://github.com/vesoft-inc/nebula-cpp.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-cpp.git\n
Change the working directory to nebula-cpp
.
$ cd nebula-cpp\n
Create a directory named build
and change the working directory to it.
$ mkdir build && cd build\n
Generate the makefile
file with CMake.
Note
The default installation path is /usr/local/nebula
. To modify it, add the -DCMAKE_INSTALL_PREFIX=<installation_path>
option while running the following command.
$ cmake -DCMAKE_BUILD_TYPE=Release ..\n
Note
If G++ does not support C++ 11, add the option -DDISABLE_CXX11_ABI=ON
.
Compile NebulaGraph CPP.
To speed up the compiling, use the -j
option to set a concurrent number N
. It should be \\(\\min(\\text{CPU core number},\\frac{\\text{the memory size(GB)}}{2})\\).
$ make -j{N}\n
Install NebulaGraph CPP.
$ sudo make install\n
Update the dynamic link library.
$ sudo ldconfig\n
Compile the CPP file to an executable file, then you can use it. The following steps take using SessionExample.cpp
for example.
Use the example code to create the SessionExample.cpp
file.
Run the following command to compile the file.
$ LIBRARY_PATH=<library_folder_path>:$LIBRARY_PATH g++ -std=c++11 SessionExample.cpp -I<include_folder_path> -lnebula_graph_client -o session_example\n
library_folder_path
: The storage path of the NebulaGraph dynamic libraries. The default path is /usr/local/nebula/lib64
.include_folder_path
: The storage of the NebulaGraph header files. The default path is /usr/local/nebula/include
.For example:
$ LIBRARY_PATH=/usr/local/nebula/lib64:$LIBRARY_PATH g++ -std=c++11 SessionExample.cpp -I/usr/local/nebula/include -lnebula_graph_client -o session_example\n
"},{"location":"14.client/3.nebula-cpp-client/#api_reference","title":"API reference","text":"Click here to check the classes and functions provided by the CPP Client.
"},{"location":"14.client/3.nebula-cpp-client/#core_of_the_example_code","title":"Core of the example code","text":"Nebula CPP clients provide both Session Pool and Connection Pool methods to connect to NebulaGraph. Using the Connection Pool method requires users to manage session instances by themselves.
Session Pool
For more details about all the code, see SessionPoolExample.
Connection Pool
For more details about all the code, see SessionExample.
NebulaGraph Java is a Java client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/4.nebula-java-client/#prerequisites","title":"Prerequisites","text":"You have installed Java 8.0 or later versions.
"},{"location":"14.client/4.nebula-java-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/4.nebula-java-client/#download_nebulagraph_java","title":"Download NebulaGraph Java","text":"(Recommended) To install a specific version of NebulaGraph Java, use the Git option --branch
to specify the branch. For example, to install v3.6.1, run the following command:
$ git clone --branch release-3.6 https://github.com/vesoft-inc/nebula-java.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-java.git\n
Note
We recommend that each thread use one session. If multiple threads use the same session, the performance will be reduced.
When importing a Maven project with tools such as IDEA, set the following dependency in pom.xml
.
Note
3.0.0-SNAPSHOT
indicates the daily development version that may have unknown issues. We recommend that you replace 3.0.0-SNAPSHOT
with a released version number to use a table version.
<dependency>\n <groupId>com.vesoft</groupId>\n <artifactId>client</artifactId>\n <version>3.0.0-SNAPSHOT</version>\n</dependency>\n
If you cannot download the dependency for the daily development version, set the following content in pom.xml
. Released versions have no such issue.
<repositories> \n <repository> \n <id>snapshots</id> \n <url>https://oss.sonatype.org/content/repositories/snapshots/</url> \n </repository> \n</repositories>\n
If there is no Maven to manage the project, manually download the JAR file to install NebulaGraph Java.
"},{"location":"14.client/4.nebula-java-client/#api_reference","title":"API reference","text":"Click here to check the classes and functions provided by the Java Client.
"},{"location":"14.client/4.nebula-java-client/#core_of_the_example_code","title":"Core of the example code","text":"The NebulaGraph Java client provides both Connection Pool and Session Pool modes, using Connection Pool requires the user to manage session instances.
Session Pool
For all the code, see GraphSessionPoolExample.
Connection Pool
For all the code, see GraphClientExample.
NebulaGraph Python is a Python client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/5.nebula-python-client/#prerequisites","title":"Prerequisites","text":"You have installed Python 3.6 or later versions.
"},{"location":"14.client/5.nebula-python-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/5.nebula-python-client/#install_nebulagraph_python","title":"Install NebulaGraph Python","text":""},{"location":"14.client/5.nebula-python-client/#install_nebulagraph_python_with_pip","title":"Install NebulaGraph Python with pip","text":"$ pip install nebula3-python==<version>\n
"},{"location":"14.client/5.nebula-python-client/#install_nebulagraph_python_from_the_source_code","title":"Install NebulaGraph Python from the source code","text":"Clone the NebulaGraph Python source code to the host.
(Recommended) To install a specific version of NebulaGraph Python, use the Git option --branch
to specify the branch. For example, to install v3.4.0, run the following command:
$ git clone --branch release-3.4 https://github.com/vesoft-inc/nebula-python.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-python.git\n
Change the working directory to nebula-python.
$ cd nebula-python\n
Run the following command to install NebulaGraph Python.
$ pip install .\n
Click here to check the classes and functions provided by the Python Client.
"},{"location":"14.client/5.nebula-python-client/#core_of_the_example_code","title":"Core of the example code","text":"NebulaGraph Python clients provides Connection Pool and Session Pool methods to connect to NebulaGraph. Using the Connection Pool method requires users to manage sessions by themselves.
Session Pool
For details about all the code, see SessinPoolExample.py.
For limitations of using the Session Pool method, see Example of using session pool.
Connection Pool
For details about all the code, see Example.
NebulaGraph Go is a Golang client for connecting to and managing the NebulaGraph database.
"},{"location":"14.client/6.nebula-go-client/#prerequisites","title":"Prerequisites","text":"You have installed Golang 1.13 or later versions.
"},{"location":"14.client/6.nebula-go-client/#compatibility_with_nebulagraph","title":"Compatibility with NebulaGraph","text":"See github.
"},{"location":"14.client/6.nebula-go-client/#download_nebulagraph_go","title":"Download NebulaGraph Go","text":"(Recommended) To install a specific version of NebulaGraph Go, use the Git option --branch
to specify the branch. For example, to install v3.7.0, run the following command:
$ git clone --branch release-3.7 https://github.com/vesoft-inc/nebula-go.git\n
To install the daily development version, run the following command to download the source code from the master
branch:
$ git clone https://github.com/vesoft-inc/nebula-go.git\n
Run the following command to install or update NebulaGraph Go:
$ go get -u -v github.com/vesoft-inc/nebula-go/v3@v3.7.0\n
"},{"location":"14.client/6.nebula-go-client/#api_reference","title":"API reference","text":"Click here to check the functions and types provided by the GO Client.
"},{"location":"14.client/6.nebula-go-client/#core_of_the_example_code","title":"Core of the example code","text":"The NebulaGraph GO client provides both Connection Pool and Session Pool, using Connection Pool requires the user to manage the session instances.
Session Pool
For details about all the code, see session_pool_example.go.
For limitations of using Session Pool, see Usage example.
Connection Pool
For all the code, see graph_client_basic_example and graph_client_goroutines_example.
You can use the following clients developed by community users to connect to and manage NebulaGraph:
You are welcome to contribute any code or files to the project. But firstly we suggest you raise an issue on the github or the forum to start a discussion with the community. Check through the topic for Github.
"},{"location":"15.contribution/how-to-contribute/#sign_the_contributor_license_agreement_cla","title":"Sign the Contributor License Agreement CLA","text":"If you have any questions, submit an issue.
"},{"location":"15.contribution/how-to-contribute/#modify_a_single_document","title":"Modify a single document","text":"This manual is written in the Markdown language. Click the pencil
icon on the right of the document title to commit the modification.
This method applies to modifying a single document only.
"},{"location":"15.contribution/how-to-contribute/#batch_modify_or_add_files","title":"Batch modify or add files","text":"This method applies to contributing code, modifying multiple documents in batches, or adding new documents.
"},{"location":"15.contribution/how-to-contribute/#step_1_fork_in_the_githubcom","title":"Step 1: Fork in the github.com","text":"The NebulaGraph project has many repositories. Take the nebul repository for example:
Visit https://github.com/vesoft-inc/nebula.
Click the Fork
button to establish an online fork.
Define a local working directory.
# Define the working directory.\nworking_dir=$HOME/Workspace\n
Set user
to match the Github profile name.
user={the Github profile name}\n
Create your clone.
mkdir -p $working_dir\ncd $working_dir\ngit clone https://github.com/$user/nebula.git\n# or: git clone git@github.com:$user/nebula.git\n\ncd $working_dir/nebula\ngit remote add upstream https://github.com/vesoft-inc/nebula.git\n# or: git remote add upstream git@github.com:vesoft-inc/nebula.git\n\n# Never push to upstream master since you do not have write access.\ngit remote set-url --push upstream no_push\n\n# Confirm that the remote branch is valid.\n# The correct format is:\n# origin git@github.com:$(user)/nebula.git (fetch)\n# origin git@github.com:$(user)/nebula.git (push)\n# upstream https://github.com/vesoft-inc/nebula (fetch)\n# upstream no_push (push)\ngit remote -v\n
(Optional) Define a pre-commit hook.
Please link the NebulaGraph pre-commit hook into the .git
directory.
This hook checks the commits for formatting, building, doc generation, etc.
cd $working_dir/nebula/.git/hooks\nln -s $working_dir/nebula/.linters/cpp/hooks/pre-commit.sh .\n
Sometimes, the pre-commit hook cannot be executed. You have to execute it manually.
cd $working_dir/nebula/.git/hooks\nchmod +x pre-commit\n
Get your local master up to date.
cd $working_dir/nebula\ngit fetch upstream\ngit checkout master\ngit rebase upstream/master\n
Checkout a new branch from master.
git checkout -b myfeature\n
Note
Because the PR often consists of several commits, which might be squashed while being merged into upstream. We strongly suggest you to open a separate topic branch to make your changes on. After merged, this topic branch can be just abandoned, thus you could synchronize your master branch with upstream easily with a rebase like above. Otherwise, if you commit your changes directly into master, you need to use a hard reset on the master branch. For example:
git fetch upstream\ngit checkout master\ngit reset --hard upstream/master\ngit push --force origin master\n
Code style
NebulaGraph adopts cpplint
to make sure that the project conforms to Google's coding style guides. The checker will be implemented before the code is committed.
Unit tests requirements
Please add unit tests for the new features or bug fixes.
Build your code with unit tests enabled
For more information, see Install NebulaGraph by compiling the source code.
Note
Make sure you have enabled the building of unit tests by setting -DENABLE_TESTING=ON
.
Run tests
In the root directory of nebula
, run the following command:
cd nebula/build\nctest -j$(nproc)\n
# While on your myfeature branch.\ngit fetch upstream\ngit rebase upstream/master\n
Users need to bring the head branch up to date after other contributors merge PR to the base branch.
"},{"location":"15.contribution/how-to-contribute/#step_6_commit","title":"Step 6: Commit","text":"Commit your changes.
git commit -a\n
Users can use the command --amend
to re-edit the previous code.
When ready to review or just to establish an offsite backup, push your branch to your fork on github.com
:
git push origin myfeature\n
"},{"location":"15.contribution/how-to-contribute/#step_8_create_a_pull_request","title":"Step 8: Create a Pull Request","text":"Visit your fork at https://github.com/$user/nebula
(replace $user
here).
Click the Compare & pull request
button next to your myfeature
branch.
Once your pull request has been created, it will be assigned to at least two reviewers. Those reviewers will do a thorough code review to make sure that the changes meet the repository's contributing guidelines and other quality standards.
"},{"location":"15.contribution/how-to-contribute/#add_test_cases","title":"Add test cases","text":"For detailed methods, see How to add test cases.
"},{"location":"15.contribution/how-to-contribute/#donation","title":"Donation","text":""},{"location":"15.contribution/how-to-contribute/#step_1_confirm_the_project_donation","title":"Step 1: Confirm the project donation","text":"Contact the official NebulaGraph staff via email, WeChat, Slack, etc. to confirm the donation project. The project will be donated to the NebulaGraph Contrib organization.
Email address: info@vesoft.com
WeChat: NebulaGraphbot
Slack: Join Slack
"},{"location":"15.contribution/how-to-contribute/#step_2_get_the_information_of_the_project_recipient","title":"Step 2: Get the information of the project recipient","text":"The NebulaGraph official staff will give the recipient ID of the NebulaGraph Contrib project.
"},{"location":"15.contribution/how-to-contribute/#step_3_donate_a_project","title":"Step 3: Donate a project","text":"The user transfers the project to the recipient of this donation, and the recipient transfers the project to the NebulaGraph Contrib organization. After the donation, the user will continue to lead the development of community projects as a Maintainer.
For operations of transferring a repository on GitHub, see Transferring a repository owned by your user account.
"},{"location":"2.quick-start/1.quick-start-workflow/","title":"Quickly deploy NebulaGraph using Docker","text":"You can quickly get started with NebulaGraph by deploying NebulaGraph with Docker Desktop or Docker Compose.
Using Docker DesktopUsing Docker ComposeNebulaGraph is available as a Docker Extension that you can easily install and run on your Docker Desktop. You can quickly deploy NebulaGraph using Docker Desktop with just one click.
Install Docker Desktop.
Caution
We do not recommend you deploy NebulaGraph on Docker Desktop for Windows due to its subpar performance. For details, see #12401. If you must use Docker Desktop for Windows, install WSL 2 first.
In the left sidebar of Docker Desktop, click Extensions or Add Extensions.
On the Extensions Marketplace, search for NebulaGraph and click Install.
Click Update to update NebulaGraph to the latest version when a new version is available.
Click Open to navigate to the NebulaGraph extension page.
At the top of the page, click Studio in Browser to use NebulaGraph.
For more information about how to use NebulaGraph with Docker Desktop, see the following video:
With Docker Compose, you can quickly deploy NebulaGraph services based on the prepared configuration file. This method is recommended only for testing NebulaGraph functions.
"},{"location":"2.quick-start/1.quick-start-workflow/#prerequisites","title":"Prerequisites","text":"You have installed the following applications on your host.
Application Recommended version Official installation reference Docker Latest Install Docker Engine Docker Compose Latest Install Docker Compose Git Latest Download Git
If you have deployed another version of NebulaGraph with Docker Compose on your host, delete the nebula-docker-compose/data
directory first to avoid compatibility issues.
Clone the release-3.6
branch of the nebula-docker-compose
repository to your host with Git.
Danger
The master
branch contains the untested code for the latest NebulaGraph development release. DO NOT use this release in a production environment.
$ git clone -b release-3.6 https://github.com/vesoft-inc/nebula-docker-compose.git\n
Note
The x.y
version of nebula-docker-compose aligns with the x.y
version of NebulaGraph. For a NebulaGraph z
(patch) version, nebula-docker-compose does not publish a corresponding z
version, but pulls the z
version of the NebulaGraph image.
Go to the nebula-docker-compose
directory.
$ cd nebula-docker-compose/\n
Run the following command to start all the NebulaGraph services.
Note
[nebula-docker-compose]$ docker-compose up -d\nCreating nebula-docker-compose_metad0_1 ... done\nCreating nebula-docker-compose_metad2_1 ... done\nCreating nebula-docker-compose_metad1_1 ... done\nCreating nebula-docker-compose_graphd2_1 ... done\nCreating nebula-docker-compose_graphd_1 ... done\nCreating nebula-docker-compose_graphd1_1 ... done\nCreating nebula-docker-compose_storaged0_1 ... done\nCreating nebula-docker-compose_storaged2_1 ... done\nCreating nebula-docker-compose_storaged1_1 ... done\n
Compatibility
Starting from NebulaGraph version 3.1.0, nebula-docker-compose automatically starts a NebulaGraph Console docker container and adds the storage host to the cluster (i.e. ADD HOSTS
command).
Note
For more information about the preceding services, see NebulaGraph architecture.
There are two ways to connect to NebulaGraph:
Outside the container: because the Graph service's port 9669
is mapped to the host in the container's configuration file, you can connect directly through the default port. For details, see Connect to NebulaGraph.
Inside the container: log in to the NebulaGraph Console docker container and connect from there, as the following steps describe.
Run the following command to view the name of the NebulaGraph Console docker container.
$ docker-compose ps\n Name Command State Ports\n--------------------------------------------------------------------------------------------\nnebula-docker-compose_console_1 sh -c sleep 3 && Up\n nebula-co ...\n......\n
Run the following command to enter the NebulaGraph Console docker container.
docker exec -it nebula-docker-compose_console_1 /bin/sh\n/ #\n
Connect to NebulaGraph with NebulaGraph Console.
/ # ./usr/local/bin/nebula-console -u <user_name> -p <password> --address=graphd --port=9669\n
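For instance, with authentication off, a concrete invocation inside the container could be:
/ # ./usr/local/bin/nebula-console -u root -p nebula --address=graphd --port=9669\n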
Note
By default, authentication is off; you can log in only with an existing username (the default is root
) and any password. To turn authentication on, see Enable authentication.
Run the following commands to view the cluster state.
nebula> SHOW HOSTS;\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n
Run exit
twice to switch back to your terminal (shell).
Run docker-compose ps
to list all the services of NebulaGraph and their status and ports.
Note
NebulaGraph provides services to the clients through port 9669
by default. To use other ports, modify the docker-compose.yaml
file in the nebula-docker-compose
directory and restart the NebulaGraph services.
$ docker-compose ps\nnebula-docker-compose_console_1 sh -c sleep 3 && Up\n nebula-co ...\nnebula-docker-compose_graphd1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49174->19669/tcp,:::49174->19669/tcp, 0.0.0.0:49171->19670/tcp,:::49171->19670/tcp, 0.0.0.0:49177->9669/tcp,:::49177->9669/tcp\nnebula-docker-compose_graphd2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49175->19669/tcp,:::49175->19669/tcp, 0.0.0.0:49172->19670/tcp,:::49172->19670/tcp, 0.0.0.0:49178->9669/tcp,:::49178->9669/tcp\nnebula-docker-compose_graphd_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49180->19669/tcp,:::49180->19669/tcp, 0.0.0.0:49179->19670/tcp,:::49179->19670/tcp, 0.0.0.0:9669->9669/tcp,:::9669->9669/tcp\nnebula-docker-compose_metad0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49157->19559/tcp,:::49157->19559/tcp, 0.0.0.0:49154->19560/tcp,:::49154->19560/tcp, 0.0.0.0:49160->9559/tcp,:::49160->9559/tcp, 9560/tcp\nnebula-docker-compose_metad1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49156->19559/tcp,:::49156->19559/tcp, 0.0.0.0:49153->19560/tcp,:::49153->19560/tcp, 0.0.0.0:49159->9559/tcp,:::49159->9559/tcp, 9560/tcp\nnebula-docker-compose_metad2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49158->19559/tcp,:::49158->19559/tcp, 0.0.0.0:49155->19560/tcp,:::49155->19560/tcp, 0.0.0.0:49161->9559/tcp,:::49161->9559/tcp, 9560/tcp\nnebula-docker-compose_storaged0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49166->19779/tcp,:::49166->19779/tcp, 0.0.0.0:49163->19780/tcp,:::49163->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49169->9779/tcp,:::49169->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49165->19779/tcp,:::49165->19779/tcp, 0.0.0.0:49162->19780/tcp,:::49162->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49168->9779/tcp,:::49168->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49167->19779/tcp,:::49167->19779/tcp, 0.0.0.0:49164->19780/tcp,:::49164->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49170->9779/tcp,:::49170->9779/tcp, 9780/tcp\n
If a service is abnormal, first confirm the name of the abnormal container (such as nebula-docker-compose_graphd2_1
).
Then run docker ps
to find the corresponding CONTAINER ID
(such as 2a6c56c405f5
).
[nebula-docker-compose]$ docker ps\nCONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES\n2a6c56c405f5 vesoft/nebula-graphd:nightly \"/usr/local/nebula/b\u2026\" 36 minutes ago Up 36 minutes (healthy) 0.0.0.0:49230->9669/tcp, 0.0.0.0:49229->19669/tcp, 0.0.0.0:49228->19670/tcp nebula-docker-compose_graphd2_1\n7042e0a8e83d vesoft/nebula-storaged:nightly \"./bin/nebula-storag\u2026\" 36 minutes ago Up 36 minutes (healthy) 9777-9778/tcp, 9780/tcp, 0.0.0.0:49227->9779/tcp, 0.0.0.0:49226->19779/tcp, 0.0.0.0:49225->19780/tcp nebula-docker-compose_storaged2_1\n18e3ea63ad65 vesoft/nebula-storaged:nightly \"./bin/nebula-storag\u2026\" 36 minutes ago Up 36 minutes (healthy) 9777-9778/tcp, 9780/tcp, 0.0.0.0:49219->9779/tcp, 0.0.0.0:49218->19779/tcp, 0.0.0.0:49217->19780/tcp nebula-docker-compose_storaged0_1\n4dcabfe8677a vesoft/nebula-graphd:nightly \"/usr/local/nebula/b\u2026\" 36 minutes ago Up 36 minutes (healthy) 0.0.0.0:49224->9669/tcp, 0.0.0.0:49223->19669/tcp, 0.0.0.0:49222->19670/tcp nebula-docker-compose_graphd1_1\na74054c6ae25 vesoft/nebula-graphd:nightly \"/usr/local/nebula/b\u2026\" 36 minutes ago Up 36 minutes (healthy) 0.0.0.0:9669->9669/tcp, 0.0.0.0:49221->19669/tcp, 0.0.0.0:49220->19670/tcp nebula-docker-compose_graphd_1\n880025a3858c vesoft/nebula-storaged:nightly \"./bin/nebula-storag\u2026\" 36 minutes ago Up 36 minutes (healthy) 9777-9778/tcp, 9780/tcp, 0.0.0.0:49216->9779/tcp, 0.0.0.0:49215->19779/tcp, 0.0.0.0:49214->19780/tcp nebula-docker-compose_storaged1_1\n45736a32a23a vesoft/nebula-metad:nightly \"./bin/nebula-metad \u2026\" 36 minutes ago Up 36 minutes (healthy) 9560/tcp, 0.0.0.0:49213->9559/tcp, 0.0.0.0:49212->19559/tcp, 0.0.0.0:49211->19560/tcp nebula-docker-compose_metad0_1\n3b2c90eb073e vesoft/nebula-metad:nightly \"./bin/nebula-metad \u2026\" 36 minutes ago Up 36 minutes (healthy) 9560/tcp, 0.0.0.0:49207->9559/tcp, 0.0.0.0:49206->19559/tcp, 0.0.0.0:49205->19560/tcp nebula-docker-compose_metad2_1\n7bb31b7a5b3f vesoft/nebula-metad:nightly \"./bin/nebula-metad \u2026\" 36 minutes ago Up 36 minutes (healthy) 9560/tcp, 0.0.0.0:49210->9559/tcp, 0.0.0.0:49209->19559/tcp, 0.0.0.0:49208->19560/tcp nebula-docker-compose_metad1_1\n
Use the CONTAINER ID
to log in to the container and troubleshoot.
[nebula-docker-compose]$ docker exec -it 2a6c56c405f5 bash\n[root@2a6c56c405f5 nebula]#\n
"},{"location":"2.quick-start/1.quick-start-workflow/#check_the_service_data_and_logs","title":"Check the service data and logs","text":"All the data and logs of NebulaGraph are stored persistently in the nebula-docker-compose/data
and nebula-docker-compose/logs
directories.
The structure of the directories is as follows:
nebula-docker-compose/\n├── docker-compose.yaml\n├── data\n│   ├── meta0\n│   ├── meta1\n│   ├── meta2\n│   ├── storage0\n│   ├── storage1\n│   └── storage2\n└── logs\n    ├── graph\n    ├── graph1\n    ├── graph2\n    ├── meta0\n    ├── meta1\n    ├── meta2\n    ├── storage0\n    ├── storage1\n    └── storage2\n
"},{"location":"2.quick-start/1.quick-start-workflow/#stop_the_nebulagraph_services","title":"Stop the NebulaGraph services","text":"You can run the following command to stop the NebulaGraph services:
$ docker-compose down\n
The following information indicates you have successfully stopped the NebulaGraph services:
Stopping nebula-docker-compose_console_1 ... done\nStopping nebula-docker-compose_graphd1_1 ... done\nStopping nebula-docker-compose_graphd_1 ... done\nStopping nebula-docker-compose_graphd2_1 ... done\nStopping nebula-docker-compose_storaged1_1 ... done\nStopping nebula-docker-compose_storaged0_1 ... done\nStopping nebula-docker-compose_storaged2_1 ... done\nStopping nebula-docker-compose_metad2_1 ... done\nStopping nebula-docker-compose_metad0_1 ... done\nStopping nebula-docker-compose_metad1_1 ... done\nRemoving nebula-docker-compose_console_1 ... done\nRemoving nebula-docker-compose_graphd1_1 ... done\nRemoving nebula-docker-compose_graphd_1 ... done\nRemoving nebula-docker-compose_graphd2_1 ... done\nRemoving nebula-docker-compose_storaged1_1 ... done\nRemoving nebula-docker-compose_storaged0_1 ... done\nRemoving nebula-docker-compose_storaged2_1 ... done\nRemoving nebula-docker-compose_metad2_1 ... done\nRemoving nebula-docker-compose_metad0_1 ... done\nRemoving nebula-docker-compose_metad1_1 ... done\nRemoving network nebula-docker-compose_nebula-net\n
Danger
The parameter -v
in the command docker-compose down -v
will delete all your local NebulaGraph storage data. Try this command if you are using the nightly release and having some compatibility issues.
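For reference, the destructive variant reads as follows; run it only if losing the local data is acceptable:
$ docker-compose down -v\n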
The configuration file of NebulaGraph deployed by Docker Compose is nebula-docker-compose/docker-compose.yaml
. To make the new configuration take effect, modify the configuration in this file and restart the service.
For more instructions, see Configurations.
"},{"location":"2.quick-start/1.quick-start-workflow/#faq","title":"FAQ","text":""},{"location":"2.quick-start/1.quick-start-workflow/#how_to_fix_the_docker_mapping_to_external_ports","title":"How to fix the docker mapping to external ports?","text":"To set the ports
of corresponding services as fixed mapping, modify the docker-compose.yaml
in the nebula-docker-compose
directory. For example:
graphd:\n image: vesoft/nebula-graphd:release-3.6\n ...\n ports:\n - 9669:9669\n - 19669\n - 19670\n
9669:9669
indicates that the internal port 9669 is mapped to the fixed external port 9669, while 19669
indicates that the internal port 19669 is mapped to a random external port.
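After restarting the services, a quick way to check which external port was actually assigned is docker-compose port; the service name graphd below follows the docker-compose.yaml above:
$ docker-compose port graphd 19669\n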
In the nebula-docker-compose/docker-compose.yaml
file, change all the image
values to the required image version.
In the nebula-docker-compose
directory, run docker-compose pull
to update the images of the Graph Service, Storage Service, Meta Service, and NebulaGraph Console.
Run docker-compose up -d
to start the NebulaGraph services again.
After connecting to NebulaGraph with NebulaGraph Console, run SHOW HOSTS GRAPH
, SHOW HOSTS STORAGE
, or SHOW HOSTS META
to check the version of each service respectively.
ERROR: toomanyrequests
when docker-compose pull
","text":"You may meet the following error.
ERROR: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
.
You have met the rate limit of Docker Hub. Learn more on Understanding Docker Hub Rate Limiting.
"},{"location":"2.quick-start/1.quick-start-workflow/#how_to_update_the_nebulagraph_console_client","title":"How to update the NebulaGraph Console client","text":"The command docker-compose pull
updates both the NebulaGraph services and the NebulaGraph Console.
RPM and DEB are common package formats on Linux systems. This topic shows how to quickly install NebulaGraph with the RPM or DEB package.
Note
The console is not complied or packaged with NebulaGraph server binaries. You can install nebula-console by yourself.
"},{"location":"2.quick-start/2.install-nebula-graph/#prerequisites","title":"Prerequisites","text":"wget
is installed.Note
NebulaGraph is currently only supported for installation on Linux systems, and only CentOS 7.x, CentOS 8.x, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 operating systems are supported.
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.deb\n
For example, download the release package master
for Centos 7.5
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm.sha256sum.txt\n
Download the release package master
for Ubuntu 1804
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb.sha256sum.txt\n
Download the nightly version.
Danger
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu2004.amd64.deb\n
For example, download the Centos 7.5
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm.sha256sum.txt\n
For example, download the Ubuntu 1804
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb.sha256sum.txt\n
Use the following syntax to install with an RPM package.
$ sudo rpm -ivh --prefix=<installation_path> <package_name>\n
The option --prefix
indicates the installation path. The default path is /usr/local/nebula/
.
For example, to install an RPM package in the default path for the master version, run the following command.
sudo rpm -ivh nebula-graph-master.el7.x86_64.rpm\n
Use the following syntax to install with a DEB package.
$ sudo dpkg -i <package_name>\n
Note
Customizing the installation path is not supported when installing NebulaGraph with a DEB package. The default installation path is /usr/local/nebula/
.
For example, to install a DEB package for the master version, run the following command.
sudo dpkg -i nebula-graph-master.ubuntu1804.amd64.deb\n
Note
The default installation path is /usr/local/nebula/
.
When connecting to NebulaGraph for the first time, you have to add the Storage hosts, and confirm that all the hosts are online.
Compatibility
ADD HOSTS
before reading or writing data into the Storage Service.ADD HOSTS
is not needed. You have connected to NebulaGraph.
"},{"location":"2.quick-start/3.1add-storage-hosts/#steps","title":"Steps","text":"Add the Storage hosts.
Run the following command to add hosts:
ADD HOSTS <ip>:<port> [,<ip>:<port> ...];\n
Example\uff1a
nebula> ADD HOSTS 192.168.10.100:9779, 192.168.10.101:9779, 192.168.10.102:9779;\n
Caution
Make sure that the IP you added is the same as the IP configured for local_ip
in the nebula-storaged.conf
file. Otherwise, the Storage service will fail to start. For information about configurations, see Configurations.
Check the status of the hosts to make sure that they are all online.
nebula> SHOW HOSTS;\n+------------------+------+----------+--------------+---------------------- +------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+------------------+------+----------+--------------+---------------------- +------------------------+---------+\n| \"192.168.10.100\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.101\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\"|\n| \"192.168.10.102\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\"|\n+------------------+------+----------+--------------+---------------------- +------------------------+---------+\n
The Status
column of the result above shows that all Storage hosts are online.
This topic provides basic instruction on how to use the native CLI client NebulaGraph Console to connect to NebulaGraph.
Caution
When connecting to NebulaGraph for the first time, you must register the Storage Service before querying data.
NebulaGraph supports multiple types of clients, including a CLI client, a GUI client, and clients developed in popular programming languages. For more information, see the client list.
"},{"location":"2.quick-start/3.connect-to-nebula-graph/#prerequisites","title":"Prerequisites","text":"The NebulaGraph Console version is compatible with the NebulaGraph version.
Note
NebulaGraph Console and NebulaGraph of the same version number are the most compatible. There may be compatibility issues when connecting to NebulaGraph with a different version of NebulaGraph Console. The error message incompatible version between client and server
is displayed when there is such an issue.
On the NebulaGraph Console releases page, select a NebulaGraph Console version and click Assets.
Note
It is recommended to select the latest version.
In the Assets area, find the correct binary file for the machine where you want to run NebulaGraph Console and download the file to the machine.
(Optional) Rename the binary file to nebula-console
for convenience.
Note
For Windows, rename the file to nebula-console.exe
.
On the machine to run NebulaGraph Console, grant the execute permission of the nebula-console binary file to the user.
Note
For Windows, skip this step.
$ chmod 111 nebula-console\n
In the command line interface, change the working directory to the one where the nebula-console binary file is stored.
Run the following command to connect to NebulaGraph.
$ ./nebula-console -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
> nebula-console.exe -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP (or hostname) of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is millisecond. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. For information on more parameters, see the project repository.
This topic will describe the basic CRUD operations in NebulaGraph.
For more information, see nGQL guide.
"},{"location":"2.quick-start/4.nebula-graph-crud/#graph_space_and_nebulagraph_schema","title":"Graph space and NebulaGraph schema","text":"A NebulaGraph instance consists of one or more graph spaces. Graph spaces are physically isolated from each other. You can use different graph spaces in the same instance to store different datasets.
To insert data into a graph space, define a schema for the graph database. NebulaGraph schema is based on the following components.
Schema component Description Vertex Represents an entity in the real world. A vertex can have zero to multiple tags. Tag The type of the same group of vertices. It defines a set of properties that describes the types of vertices. Edge Represents a directed relationship between two vertices. Edge type The type of an edge. It defines a group of properties that describes the types of edges.For more information, see Data modeling.
In this topic, we will use the following dataset to demonstrate basic CRUD operations.
"},{"location":"2.quick-start/4.nebula-graph-crud/#async_implementation_of_create_and_alter","title":"Async implementation ofCREATE
and ALTER
","text":"Caution
In NebulaGraph, the following CREATE
or ALTER
commands are implemented in an async way and take effect in the next heartbeat cycle. Otherwise, an error will be returned. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
CREATE SPACE
CREATE TAG
CREATE EDGE
ALTER TAG
ALTER EDGE
CREATE TAG INDEX
CREATE EDGE INDEX
Note
The default heartbeat interval is 10 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
CREATE SPACE [IF NOT EXISTS] <graph_space_name> (\n[partition_num = <partition_number>,]\n[replica_factor = <replica_number>,]\nvid_type = {FIXED_STRING(<N>) | INT64}\n)\n[COMMENT = '<comment>'];\n
For more information on parameters, see CREATE SPACE.
nebula> SHOW SPACES;\n
USE <graph_space_name>;\n
Use the following statement to create a graph space named basketballplayer
.
nebula> CREATE SPACE basketballplayer(partition_num=15, replica_factor=1, vid_type=fixed_string(30));\n
Note
If the system returns the error [ERROR (-1005)]: Host not enough!
, check whether registered the Storage Service.
Check the partition distribution with SHOW HOSTS
to make sure that the partitions are distributed in a balanced way.
nebula> SHOW HOSTS;\n+-------------+-----------+-----------+--------------+----------------------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+-----------+-----------+--------------+----------------------------------+------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 5 | \"basketballplayer:5\" | \"basketballplayer:5\" | \"master\"|\n| \"storaged1\" | 9779 | \"ONLINE\" | 5 | \"basketballplayer:5\" | \"basketballplayer:5\" | \"master\"|\n| \"storaged2\" | 9779 | \"ONLINE\" | 5 | \"basketballplayer:5\" | \"basketballplayer:5\" | \"master\"|\n+-------------+-----------+-----------+-----------+--------------+----------------------------------+------------------------+---------+\n
If the Leader distribution is uneven, use BALANCE LEADER
to redistribute the partitions. For more information, see BALANCE.
Use the basketballplayer
graph space.
nebula[(none)]> USE basketballplayer;\n
You can use SHOW SPACES
to check the graph space you created.
nebula> SHOW SPACES;\n+--------------------+\n| Name |\n+--------------------+\n| \"basketballplayer\" |\n+--------------------+\n
CREATE {TAG | EDGE} [IF NOT EXISTS] {<tag_name> | <edge_type_name>}\n (\n <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']\n [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] \n )\n [TTL_DURATION = <ttl_duration>]\n [TTL_COL = <prop_name>]\n [COMMENT = '<comment>'];\n
For more information on parameters, see CREATE TAG and CREATE EDGE.
"},{"location":"2.quick-start/4.nebula-graph-crud/#examples_1","title":"Examples","text":"Create tags player
and team
, and edge types follow
and serve
. Descriptions are as follows.
nebula> CREATE TAG player(name string, age int);\n\nnebula> CREATE TAG team(name string);\n\nnebula> CREATE EDGE follow(degree int);\n\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
"},{"location":"2.quick-start/4.nebula-graph-crud/#insert_vertices_and_edges","title":"Insert vertices and edges","text":"You can use the INSERT
statement to insert vertices or edges based on existing tags or edge types.
INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...]\nVALUES <vid>: ([prop_value_list])\n\ntag_props:\n tag_name ([prop_name_list])\n\nprop_name_list:\n [prop_name [, prop_name] ...]\n\nprop_value_list:\n [prop_value [, prop_value] ...] \n
vid
is short for Vertex ID. A vid
must be a unique string value in a graph space. For details, see INSERT VERTEX.
Insert edges:
INSERT EDGE [IF NOT EXISTS] <edge_type> ( <prop_name_list> ) VALUES \n<src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> )\n[, <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ), ...];\n<prop_name_list> ::=\n[ <prop_name> [, <prop_name> ] ...]\n<prop_value_list> ::=\n[ <prop_value> [, <prop_value> ] ...]\n
For more information on parameters, see INSERT EDGE.
nebula> INSERT VERTEX player(name, age) VALUES \"player100\":(\"Tim Duncan\", 42);\n\nnebula> INSERT VERTEX player(name, age) VALUES \"player101\":(\"Tony Parker\", 36);\n\nnebula> INSERT VERTEX player(name, age) VALUES \"player102\":(\"LaMarcus Aldridge\", 33);\n\nnebula> INSERT VERTEX team(name) VALUES \"team203\":(\"Trail Blazers\"), \"team204\":(\"Spurs\");\n
nebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player100\":(95);\n\nnebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player102\":(90);\n\nnebula> INSERT EDGE follow(degree) VALUES \"player102\" -> \"player100\":(75);\n\nnebula> INSERT EDGE serve(start_year, end_year) VALUES \"player101\" -> \"team204\":(1999, 2018),\"player102\" -> \"team203\":(2006, 2015);\n
GO
traversal starts from one or more vertices, along one or more edges, and returns information in a form specified in the YIELD
clause.WHERE
clause to search for the data that meet the specific conditions.GO
GO [[<M> TO] <N> {STEP|STEPS} ] FROM <vertex_list>\nOVER <edge_type_list> [{REVERSELY | BIDIRECT}]\n[ WHERE <conditions> ]\nYIELD [DISTINCT] <return_list>\n[{ SAMPLE <sample_list> | <limit_by_list_clause> }]\n[| GROUP BY {<col_name> | expression> | <position>} YIELD <col_name>]\n[| ORDER BY <expression> [{ASC | DESC}]]\n[| LIMIT [<offset>,] <number_rows>];\n
FETCH
Fetch properties on tags:
FETCH PROP ON {<tag_name>[, tag_name ...] | *}\n<vid> [, vid ...]\nYIELD <return_list> [AS <alias>];\n
Fetch properties on edges:
FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]\nYIELD <output>;\n
LOOKUP
LOOKUP ON {<vertex_tag> | <edge_type>}\n[WHERE <expression> [AND <expression> ...]]\nYIELD <return_list> [AS <alias>];\n<return_list>\n <prop_name> [AS <col_alias>] [, <prop_name> [AS <prop_alias>] ...];\n
MATCH
MATCH <pattern> [<clause_1>] RETURN <output> [<clause_2>];\n
GO
statement","text":"player101
follows.nebula> GO FROM \"player101\" OVER follow YIELD id($$);\n+-------------+\n| id($$) |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n+-------------+\n
player101
follows whose age is equal to or greater than 35. Rename the corresponding columns in the results with Teammate
and Age
.nebula> GO FROM \"player101\" OVER follow WHERE properties($$).age >= 35 \\\n YIELD properties($$).name AS Teammate, properties($$).age AS Age;\n+-----------------+-----+\n| Teammate | Age |\n+-----------------+-----+\n| \"Tim Duncan\" | 42 |\n| \"Manu Ginobili\" | 41 |\n+-----------------+-----+\n
| Clause/Sign | Description | |-------------+---------------------------------------------------------------------| | YIELD
| Specifies what values or results you want to return from the query. | | $$
| Represents the target vertices. | | \\
| A line-breaker. |
Search for the players that the player with VID player101
follows. Then retrieve the teams of the players that the player with VID player100
follows. To combine the two queries, use a pipe or a temporary variable.
With a pipe:
nebula> GO FROM \"player101\" OVER follow YIELD dst(edge) AS id | \\\n GO FROM $-.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------------+---------------------+\n| Team | Player |\n+-----------------+---------------------+\n| \"Spurs\" | \"Tim Duncan\" |\n| \"Trail Blazers\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------------+---------------------+\n
Clause/Sign Description $^
Represents the source vertex of the edge. |
A pipe symbol can combine multiple queries. $-
Represents the outputs of the query before the pipe symbol. With a temporary variable:
Note
Once a composite statement is submitted to the server as a whole, the life cycle of the temporary variables in the statement ends.
nebula> $var = GO FROM \"player101\" OVER follow YIELD dst(edge) AS id; \\\n GO FROM $var.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------------+---------------------+\n| Team | Player |\n+-----------------+---------------------+\n| \"Spurs\" | \"Tim Duncan\" |\n| \"Trail Blazers\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"LaMarcus Aldridge\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------------+---------------------+\n
FETCH
statement","text":"Use FETCH
: Fetch the properties of the player with VID player100
.
nebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n+-------------------------------+\n| properties(VERTEX) |\n+-------------------------------+\n| {age: 42, name: \"Tim Duncan\"} |\n+-------------------------------+\n
Note
The examples of LOOKUP
and MATCH
statements are in indexes.
Users can use the UPDATE
or the UPSERT
statements to update existing data.
UPSERT
is the combination of UPDATE
and INSERT
. If you update a vertex or an edge with UPSERT
, the database will insert a new vertex or edge if it does not exist.
Note
UPSERT
operates serially in a partition-based order. Therefore, it is slower than INSERT
OR UPDATE
. And UPSERT
has concurrency only between multiple partitions.
UPDATE
vertices:UPDATE VERTEX <vid> SET <properties to be updated>\n[WHEN <condition>] [YIELD <columns>];\n
UPDATE
edges:UPDATE EDGE ON <edge_type> <source vid> -> <destination vid> [@rank] \nSET <properties to be updated> [WHEN <condition>] [YIELD <columns to be output>];\n
UPSERT
vertices or edges:UPSERT {VERTEX <vid> | EDGE <edge_type>} SET <update_columns>\n[WHEN <condition>] [YIELD <columns>];\n
UPDATE
the name
property of the vertex with VID player100
and check the result with the FETCH
statement.nebula> UPDATE VERTEX \"player100\" SET player.name = \"Tim\";\n\nnebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n+------------------------+\n| properties(VERTEX) |\n+------------------------+\n| {age: 42, name: \"Tim\"} |\n+------------------------+\n
UPDATE
the degree
property of an edge and check the result with the FETCH
statement.nebula> UPDATE EDGE ON follow \"player101\" -> \"player100\" SET degree = 96;\n\nnebula> FETCH PROP ON follow \"player101\" -> \"player100\" YIELD properties(edge);\n+------------------+\n| properties(EDGE) |\n+------------------+\n| {degree: 96} |\n+------------------+\n
player111
and UPSERT
it.nebula> INSERT VERTEX player(name,age) VALUES \"player111\":(\"David West\", 38);\n\nnebula> UPSERT VERTEX \"player111\" SET player.name = \"David\", player.age = $^.player.age + 11 \\\n WHEN $^.player.name == \"David West\" AND $^.player.age > 20 \\\n YIELD $^.player.name AS Name, $^.player.age AS Age;\n+---------+-----+\n| Name | Age |\n+---------+-----+\n| \"David\" | 49 |\n+---------+-----+\n
DELETE VERTEX <vid1>[, <vid2>...]\n
DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>]\n[, <src_vid> -> <dst_vid>...]\n
nebula> DELETE VERTEX \"player111\", \"team203\";\n
nebula> DELETE EDGE follow \"player101\" -> \"team204\";\n
Users can add indexes to tags and edge types with the CREATE INDEX statement.
Must-read for using indexes
Both MATCH
and LOOKUP
statements depend on the indexes. But indexes can dramatically reduce the write performance. DO NOT use indexes in production environments unless you are fully aware of their influences on your service.
Users MUST rebuild indexes for pre-existing data. Otherwise, the pre-existing data cannot be indexed and therefore cannot be returned in MATCH
or LOOKUP
statements. For more information, see REBUILD INDEX.
CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name>\nON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT = '<comment>'];\n
REBUILD {TAG | EDGE} INDEX <index_name>;\n
Note
Define the index length when creating an index for a variable-length property. In UTF-8 encoding, a non-ascii character occupies 3 bytes. You should set an appropriate index length according to the variable-length property. For example, the index should be 30 bytes for 10 non-ascii characters. For more information, see CREATE INDEX
"},{"location":"2.quick-start/4.nebula-graph-crud/#examples_of_lookup_and_match_index-based","title":"Examples ofLOOKUP
and MATCH
(index-based)","text":"Make sure there is an index for LOOKUP
or MATCH
to use. If there is not, create an index first.
Find the information of the vertex with the tag player
and its value of the name
property is Tony Parker
.
This example creates the index player_index_1
on the name
property.
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_1 ON player(name(20));\n
This example rebuilds the index to make sure it takes effect on pre-existing data.
nebula> REBUILD TAG INDEX player_index_1\n+------------+\n| New Job Id |\n+------------+\n| 31 |\n+------------+\n
This example uses the LOOKUP
statement to retrieve the vertex property.
nebula> LOOKUP ON player WHERE player.name == \"Tony Parker\" \\\n YIELD properties(vertex).name AS name, properties(vertex).age AS age;\n+---------------+-----+\n| name | age |\n+---------------+-----+\n| \"Tony Parker\" | 36 |\n+---------------+-----+\n
This example uses the MATCH
statement to retrieve the vertex property.
nebula> MATCH (v:player{name:\"Tony Parker\"}) RETURN v;\n+-----------------------------------------------------+\n| v |\n+-----------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n+-----------------------------------------------------+\n
"},{"location":"2.quick-start/5.start-stop-service/","title":"Step 2: Manage NebulaGraph Service","text":"NebulaGraph supports managing services with scripts.
"},{"location":"2.quick-start/5.start-stop-service/#manage_services_with_script","title":"Manage services with script","text":"You can use the nebula.service
script to start, stop, restart, terminate, and check the NebulaGraph services.
Note
nebula.service
is stored in the /usr/local/nebula/scripts
directory by default. If you have customized the path, use the actual path in your environment.
$ sudo /usr/local/nebula/scripts/nebula.service\n[-v] [-c <config_file_path>]\n<start | stop | restart | kill | status>\n<metad | graphd | storaged | all>\n
Parameter Description -v
Display detailed debugging information. -c
Specify the configuration file path. The default path is /usr/local/nebula/etc/
. start
Start the target services. stop
Stop the target services. restart
Restart the target services. kill
Terminate the target services. status
Check the status of the target services. metad
Set the Meta Service as the target service. graphd
Set the Graph Service as the target service. storaged
Set the Storage Service as the target service. all
Set all the NebulaGraph services as the target services."},{"location":"2.quick-start/5.start-stop-service/#start_nebulagraph","title":"Start NebulaGraph","text":"Run the following command to start NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service start all\n[INFO] Starting nebula-metad...\n[INFO] Done\n[INFO] Starting nebula-graphd...\n[INFO] Done\n[INFO] Starting nebula-storaged...\n[INFO] Done\n
"},{"location":"2.quick-start/5.start-stop-service/#stop_nebulagraph","title":"Stop NebulaGraph","text":"Danger
Do not run kill -9
to forcibly terminate the processes. Otherwise, there is a low probability of data loss.
Run the following command to stop NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service stop all\n[INFO] Stopping nebula-metad...\n[INFO] Done\n[INFO] Stopping nebula-graphd...\n[INFO] Done\n[INFO] Stopping nebula-storaged...\n[INFO] Done\n
"},{"location":"2.quick-start/5.start-stop-service/#check_the_service_status","title":"Check the service status","text":"Run the following command to check the service status of NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service status all\n
NebulaGraph is running normally if the following information is returned.
INFO] nebula-metad(33fd35e): Running as 29020, Listening on 9559\n[INFO] nebula-graphd(33fd35e): Running as 29095, Listening on 9669\n[WARN] nebula-storaged after v3.0.0 will not start service until it is added to cluster.\n[WARN] See Manage Storage hosts:ADD HOSTS in https://docs.nebula-graph.io/\n[INFO] nebula-storaged(33fd35e): Running as 29147, Listening on 9779\n
Note
After starting NebulaGraph, the port of the nebula-storaged
process is shown in red. Because the nebula-storaged
process waits for the nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
[INFO] nebula-metad: Running as 25600, Listening on 9559\n[INFO] nebula-graphd: Exited\n[INFO] nebula-storaged: Running as 25646, Listening on 9779\n
The NebulaGraph services consist of the Meta Service, Graph Service, and Storage Service. The configuration files for all three services are stored in the /usr/local/nebula/etc/
directory by default. You can check the configuration files according to the returned result to troubleshoot problems.
Connect to NebulaGraph
"},{"location":"2.quick-start/6.cheatsheet-for-ngql/","title":"nGQL cheatsheet","text":""},{"location":"2.quick-start/6.cheatsheet-for-ngql/#functions","title":"Functions","text":"Math functions
Function Description double abs(double x) Returns the absolute value of the argument. double floor(double x) Returns the largest integer value smaller than or equal to the argument. (Rounds down) double ceil(double x) Returns the smallest integer greater than or equal to the argument. (Rounds up) double round(double x) Returns the integer value nearest to the argument. Returns a number farther away from 0 if the argument is in the middle. double sqrt(double x) Returns the square root of the argument. double cbrt(double x) Returns the cubic root of the argument. double hypot(double x, double y) Returns the hypotenuse of a right-angled triangle. double pow(double x, double y) Returns the result of xy. double exp(double x) Returns the result of ex. double exp2(double x) Returns the result of 2x. double log(double x) Returns the base-e logarithm of the argument. double log2(double x) Returns the base-2 logarithm of the argument. double log10(double x) Returns the base-10 logarithm of the argument. double sin(double x) Returns the sine of the argument. double asin(double x) Returns the inverse sine of the argument. double cos(double x) Returns the cosine of the argument. double acos(double x) Returns the inverse cosine of the argument. double tan(double x) Returns the tangent of the argument. double atan(double x) Returns the inverse tangent of the argument. double rand() Returns a random floating point number in the range from 0 (inclusive) to 1 (exclusive); i.e.[0,1). int rand32(int min, int max) Returns a random 32-bit integer in[min, max)
.If you set only one argument, it is parsed as max
and min
is 0
by default.If you set no argument, the system returns a random signed 32-bit integer. int rand64(int min, int max) Returns a random 64-bit integer in [min, max)
.If you set only one argument, it is parsed as max
and min
is 0
by default.If you set no argument, the system returns a random signed 64-bit integer. bit_and() Bitwise AND. bit_or() Bitwise OR. bit_xor() Bitwise XOR. int size() Returns the number of elements in a list or a map or the length of a string. int range(int start, int end, int step) Returns a list of integers from [start,end]
in the specified steps. step
is 1 by default. int sign(double x) Returns the signum of the given number.If the number is 0
, the system returns 0
.If the number is negative, the system returns -1
.If the number is positive, the system returns 1
. double e() Returns the base of the natural logarithm, e (2.718281828459045). double pi() Returns the mathematical constant pi (3.141592653589793). double radians() Converts degrees to radians. radians(180)
returns 3.141592653589793
. Aggregating functions
Function Description avg() Returns the average value of the argument. count() Syntax:count({expr | *})
.count()
returns the number of rows (including NULL). count(expr)
returns the number of non-NULL values that meet the expression. count() and size() are different. max() Returns the maximum value. min() Returns the minimum value. collect() The collect() function returns a list containing the values returned by an expression. Using this function aggregates data by merging multiple records or values into a single list. std() Returns the population standard deviation. sum() Returns the sum value. String functions
Function Description int strcasecmp(string a, string b) Compares string a and b without case sensitivity. When a = b, the return value is 0; when a > b, the return value is greater than 0; otherwise, it is less than 0. string lower(string a) Returns the argument in lowercase. string toLower(string a) The same as lower()
. string upper(string a) Returns the argument in uppercase. string toUpper(string a) The same as upper()
. int length(a) Returns the length of the given string in bytes or the length of a path in hops. string trim(string a) Removes leading and trailing spaces. string ltrim(string a) Removes leading spaces. string rtrim(string a) Removes trailing spaces. string left(string a, int count) Returns a substring consisting of count
characters from the left side of string right(string a, int count) Returns a substring consisting of count
characters from the right side of string lpad(string a, int size, string letters) Left-pads string a with string letters
and returns a string rpad(string a, int size, string letters) Right-pads string a with string letters
and returns a string substr(string a, int pos, int count) Returns a substring extracting count
characters starting from string substring(string a, int pos, int count) The same as substr()
. string reverse(string) Returns a string in reverse order. string replace(string a, string b, string c) Replaces string b in string a with string c. list split(string a, string b) Splits string a at string b and returns a list of strings. concat() The concat()
function requires at least two or more strings. All the parameters are concatenated into one string.Syntax: concat(string1,string2,...)
concat_ws() The concat_ws()
function connects two or more strings with a predefined separator. extract() extract()
uses regular expression matching to retrieve a single substring or all substrings from a string. json_extract() The json_extract()
function converts the specified JSON string to map. Data and time functions
Function Description int now() Returns the current timestamp of the system. timestamp timestamp() Returns the current timestamp of the system. date date() Returns the current UTC date based on the current system. time time() Returns the current UTC time based on the current system. datetime datetime() Returns the current UTC date and time based on the current system.Schema-related functions
For nGQL statements
Function Description id(vertex) Returns the ID of a vertex. The data type of the result is the same as the vertex ID. map properties(vertex) Returns the properties of a vertex. map properties(edge) Returns the properties of an edge. string type(edge) Returns the edge type of an edge. src(edge) Returns the source vertex ID of an edge. The data type of the result is the same as the vertex ID. dst(edge) Returns the destination vertex ID of an edge. The data type of the result is the same as the vertex ID. int rank(edge) Returns the rank value of an edge. vertex Returns the information of vertices, including VIDs, tags, properties, and values. edge Returns the information of edges, including edge types, source vertices, destination vertices, ranks, properties, and values. vertices Returns the information of vertices in a subgraph. For more information, see GET SUBGRAPH. edges Returns the information of edges in a subgraph. For more information, see GET SUBGRAPH. path Returns the information of a path. For more information, see FIND PATH.For statements compatible with openCypher
Function Description id(<vertex>) Returns the ID of a vertex. The data type of the result is the same as the vertex ID. list tags(<vertex>) Returns the Tag of a vertex, which serves the same purpose as labels(). list labels(<vertex>) Returns the Tag of a vertex, which serves the same purpose as tags(). This function is used for compatibility with openCypher syntax. map properties(<vertex_or_edge>) Returns the properties of a vertex or an edge. string type(<edge>) Returns the edge type of an edge. src(<edge>) Returns the source vertex ID of an edge. The data type of the result is the same as the vertex ID. dst(<edge>) Returns the destination vertex ID of an edge. The data type of the result is the same as the vertex ID. vertex startNode(<path>) Visits an edge or a path and returns its source vertex ID. string endNode(<path>) Visits an edge or a path and returns its destination vertex ID. int rank(<edge>) Returns the rank value of an edge.List functions
Function Description keys(expr) Returns a list containing the string representations for all the property names of vertices, edges, or maps. labels(vertex) Returns the list containing all the tags of a vertex. nodes(path) Returns the list containing all the vertices in a path. range(start, end [, step]) Returns the list containing all the fixed-length steps in[start,end]
. step
is 1 by default. relationships(path) Returns the list containing all the relationships in a path. reverse(list) Returns the list reversing the order of all elements in the original list. tail(list) Returns all the elements of the original list, excluding the first one. head(list) Returns the first element of a list. last(list) Returns the last element of a list. reduce() The reduce()
function applies an expression to each element in a list one by one, chains the result to the next iteration by taking it as the initial value, and returns the final result. Type conversion functions
Function Description bool toBoolean() Converts a string value to a boolean value. float toFloat() Converts an integer or string value to a floating point number. string toString() Converts non-compound types of data, such as numbers, booleans, and so on, to strings. int toInteger() Converts a floating point or string value to an integer value. set toSet() Converts a list or set value to a set value. int hash() Thehash()
function returns the hash value of the argument. The argument can be a number, a string, a list, a boolean, null, or an expression that evaluates to a value of the preceding data types. Predicate functions
Predicate functions return true
or false
. They are most commonly used in WHERE
clauses.
<predicate>(<variable> IN <list> WHERE <condition>)\n
Function Description exists() Returns true
if the specified property exists in the vertex, edge or map. Otherwise, returns false
. any() Returns true
if the specified predicate holds for at least one element in the given list. Otherwise, returns false
. all() Returns true
if the specified predicate holds for all elements in the given list. Otherwise, returns false
. none() Returns true
if the specified predicate holds for no element in the given list. Otherwise, returns false
. single() Returns true
if the specified predicate holds for exactly one of the elements in the given list. Otherwise, returns false
. Conditional expressions functions
Function Description CASE TheCASE
expression uses conditions to filter the result of an nGQL query statement. It is usually used in the YIELD
and RETURN
clauses. The CASE
expression will traverse all the conditions. When the first condition is met, the CASE
expression stops reading the conditions and returns the result. If no conditions are met, it returns the result in the ELSE
clause. If there is no ELSE
clause and no conditions are met, it returns NULL
. coalesce() Returns the first not null value in all expressions. MATCH
MATCH <pattern> [<clause_1>] RETURN <output> [<clause_2>];\n
Pattern Example Description Match vertices (v)
You can use a user-defined variable in a pair of parentheses to represent a vertex in a pattern. For example: (v)
. Match tags MATCH (v:player) RETURN v
You can specify a tag with :<tag_name>
after the vertex in a pattern. Match multiple tags MATCH (v:player:team) RETURN v
To match vertices with multiple tags, use colons (:). Match vertex properties MATCH (v:player{name:\"Tim Duncan\"}) RETURN v
MATCH (v) WITH v, properties(v) as props, keys(properties(v)) as kk WHERE [i in kk where props[i] == \"Tim Duncan\"] RETURN v
You can specify a vertex property with {<prop_name>: <prop_value>}
after the tag in a pattern; or use a vertex property value to get vertices directly. Match a VID. MATCH (v) WHERE id(v) == 'player101' RETURN v
You can use the VID to match a vertex. The id()
function can retrieve the VID of a vertex. Match multiple VIDs. MATCH (v:player { name: 'Tim Duncan' })--(v2) WHERE id(v2) IN [\"player101\", \"player102\"] RETURN v2
To match multiple VIDs, use WHERE id(v) IN [vid_list]
. Match connected vertices MATCH (v:player{name:\"Tim Duncan\"})--(v2) RETURN v2.player.name AS Name
You can use the --
symbol to represent edges of both directions and match vertices connected by these edges. You can add a >
or <
to the --
symbol to specify the direction of an edge. Match paths MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) RETURN p
Connected vertices and edges form a path. You can use a user-defined variable to name a path as follows. Match edges MATCH (v:player{name:\"Tim Duncan\"})-[e]-(v2) RETURN e
MATCH ()<-[e]-() RETURN e
Besides using --
, -->
, or <--
to indicate a nameless edge, you can use a user-defined variable in a pair of square brackets to represent a named edge. For example: -[e]-
. Match an edge type MATCH ()-[e:follow]-() RETURN e
Just like vertices, you can specify an edge type with :<edge_type>
in a pattern. For example: -[e:follow]-
. Match edge type properties MATCH (v:player{name:\"Tim Duncan\"})-[e:follow{degree:95}]->(v2) RETURN e
MATCH ()-[e]->() WITH e, properties(e) as props, keys(properties(e)) as kk WHERE [i in kk where props[i] == 90] RETURN e
You can specify edge type properties with {<prop_name>: <prop_value>}
in a pattern. For example: [e:follow{likeness:95}]
; or use an edge type property value to get edges directly. Match multiple edge types MATCH (v:player{name:\"Tim Duncan\"})-[e:follow | :serve]->(v2) RETURN e
The |
symbol can help matching multiple edge types. For example: [e:follow|:serve]
. The English colon (:) before the first edge type cannot be omitted, but the English colon before the subsequent edge type can be omitted, such as [e:follow|serve]
. Match multiple edges MATCH (v:player{name:\"Tim Duncan\"})-[]->(v2)<-[e:serve]-(v3) RETURN v2, v3
You can extend a pattern to match multiple edges in a path. Match fixed-length paths MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) RETURN DISTINCT v2 AS Friends
You can use the :<edge_type>*<hop>
pattern to match a fixed-length path. hop
must be a non-negative integer. The data type of e
is the list. Match variable-length paths MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..3]->(v2) RETURN v2 AS Friends
minHop
: Optional. It represents the minimum length of the path. minHop
: must be a non-negative integer. The default value is 1.minHop
and maxHop
are optional and the default value is 1 and infinity respectively. The data type of e
is the list. Match variable-length paths with multiple edge types MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow | serve*2]->(v2) RETURN DISTINCT v2
You can specify multiple edge types in a fixed-length or variable-length pattern. In this case, hop
, minHop
, and maxHop
take effect on all edge types. The data type of e
is the list. Retrieve vertex or edge information MATCH (v:player{name:\"Tim Duncan\"}) RETURN v
MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) RETURN e
Use RETURN {<vertex_name> | <edge_name>}
to retrieve all the information of a vertex or an edge. Retrieve VIDs MATCH (v:player{name:\"Tim Duncan\"}) RETURN id(v)
Use the id()
function to retrieve VIDs. Retrieve tags MATCH (v:player{name:\"Tim Duncan\"}) RETURN labels(v)
Use the labels()
function to retrieve the list of tags on a vertex.To retrieve the nth element in the labels(v)
list, use labels(v)[n-1]
. Retrieve a single property on a vertex or an edge MATCH (v:player{name:\"Tim Duncan\"}) RETURN v.player.age
Use RETURN {<vertex_name> | <edge_name>}.<property>
to retrieve a single property.Use AS
to specify an alias for a property. Retrieve all properties on a vertex or an edge MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN properties(v2)
Use the properties()
function to retrieve all properties on a vertex or an edge. Retrieve edge types MATCH p=(v:player{name:\"Tim Duncan\"})-[e]->() RETURN DISTINCT type(e)
Use the type()
function to retrieve the matched edge types. Retrieve paths MATCH p=(v:player{name:\"Tim Duncan\"})-[*3]->() RETURN p
Use RETURN <path_name>
to retrieve all the information of the matched paths. Retrieve vertices in a path MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN nodes(p)
Use the nodes()
function to retrieve all vertices in a path. Retrieve edges in a path MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN relationships(p)
Use the relationships()
function to retrieve all edges in a path. Retrieve path length MATCH p=(v:player{name:\"Tim Duncan\"})-[*..2]->(v2) RETURN p AS Paths, length(p) AS Length
Use the length()
function to retrieve the length of a path. OPTIONAL MATCH
Pattern Example Description Matches patterns against your graph database, just likeMATCH
does. MATCH (m)-[]->(n) WHERE id(m)==\"player100\" OPTIONAL MATCH (n)-[]->(l) RETURN id(m),id(n),id(l)
If no matches are found, OPTIONAL MATCH
will use a null for missing parts of the pattern. LOOKUP
LOOKUP ON {<vertex_tag> | <edge_type>} \n[WHERE <expression> [AND <expression> ...]] \nYIELD <return_list> [AS <alias>]\n
Pattern Example Description Retrieve vertices LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD player.name AS name, player.age AS age
The following example returns vertices whose name
is Tony Parker
and the tag is player
. Retrieve edges LOOKUP ON follow WHERE follow.degree == 90 YIELD follow.degree
Returns edges whose degree
is 90
and the edge type is follow
. List vertices with a tag LOOKUP ON player YIELD properties(vertex),id(vertex)
Shows how to retrieve the VID of all vertices tagged with player
. List edges with an edge type LOOKUP ON follow YIELD edge AS e
Shows how to retrieve the source Vertex IDs, destination vertex IDs, and ranks of all edges of the follow
edge type. Count the number of vertices LOOKUP ON player YIELD id(vertex)| YIELD COUNT(*) AS Player_Count
Shows how to count the number of vertices tagged with player
. Count the number of edges LOOKUP ON follow YIELD edge as e| YIELD COUNT(*) AS Like_Count
Shows how to count the number of edges of the follow
edge type. GO
GO [[<M> TO] <N> {STEP|STEPS} ] FROM <vertex_list>\nOVER <edge_type_list> [{REVERSELY | BIDIRECT}]\n[ WHERE <conditions> ]\nYIELD [DISTINCT] <return_list>\n[{SAMPLE <sample_list> | LIMIT <limit_list>}]\n[| GROUP BY {col_name | expr | position} YIELD <col_name>]\n[| ORDER BY <expression> [{ASC | DESC}]]\n[| LIMIT [<offset_value>,] <number_rows>]\n
Example Description GO FROM \"player102\" OVER serve YIELD dst(edge)
Returns the teams that player 102 serves. GO 2 STEPS FROM \"player102\" OVER follow YIELD dst(edge)
Returns the friends of player 102 with 2 hops. GO FROM \"player100\", \"player102\" OVER serve WHERE properties(edge).start_year > 1995 YIELD DISTINCT properties($$).name AS team_name, properties(edge).start_year AS start_year, properties($^).name AS player_name
Adds a filter for the traversal. GO FROM \"player100\" OVER follow, serve YIELD properties(edge).degree, properties(edge).start_year
The following example traverses along multiple edge types. If there is no value for a property, the output is NULL
. GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS destination
The following example returns the neighbor vertices in the incoming direction of player 100. GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS id | GO FROM $-.id OVER serve WHERE properties($^).age > 20 YIELD properties($^).name AS FriendOf, properties($$).name AS Team
The following example retrieves the friends of player 100 and the teams that they serve. GO FROM \"player102\" OVER follow YIELD dst(edge) AS both
The following example returns all the neighbor vertices of player 102. GO 2 STEPS FROM \"player100\" OVER follow YIELD src(edge) AS src, dst(edge) AS dst, properties($$).age AS age | GROUP BY $-.dst YIELD $-.dst AS dst, collect_set($-.src) AS src, collect($-.age) AS age
The following example groups the outputs according to age. FETCH
Fetch vertex properties
FETCH PROP ON {<tag_name>[, tag_name ...] | *} \n<vid> [, vid ...] \nYIELD <return_list> [AS <alias>]\n
Example Description FETCH PROP ON player \"player100\" YIELD properties(vertex)
Specify a tag in the FETCH
statement to fetch the vertex properties by that tag. FETCH PROP ON player \"player100\" YIELD player.name AS name
Use a YIELD
clause to specify the properties to be returned. FETCH PROP ON player \"player101\", \"player102\", \"player103\" YIELD properties(vertex)
Specify multiple VIDs (vertex IDs) to fetch properties of multiple vertices. Separate the VIDs with commas. FETCH PROP ON player, t1 \"player100\", \"player103\" YIELD properties(vertex)
Specify multiple tags in the FETCH
statement to fetch the vertex properties by the tags. Separate the tags with commas. FETCH PROP ON * \"player100\", \"player106\", \"team200\" YIELD properties(vertex)
Use the asterisk symbol *
to fetch properties by all tags in the current graph space. Fetch edge properties
FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]\nYIELD <output>;\n
Example Description FETCH PROP ON serve \"player100\" -> \"team204\" YIELD properties(edge)
The following statement fetches all the properties of the serve
edge that connects vertex \"player100\"
and vertex \"team204\"
. FETCH PROP ON serve \"player100\" -> \"team204\" YIELD serve.start_year
Use a YIELD
clause to fetch specific properties of an edge. FETCH PROP ON serve \"player100\" -> \"team204\", \"player133\" -> \"team202\" YIELD properties(edge)
Specify multiple edge patterns (<src_vid> -> <dst_vid>[@<rank>]
) to fetch properties of multiple edges. Separate the edge patterns with commas. FETCH PROP ON serve \"player100\" -> \"team204\"@1 YIELD properties(edge)
To fetch on an edge whose rank is not 0, set its rank in the FETCH statement. GO FROM \"player101\" OVER follow YIELD follow._src AS s, follow._dst AS d | FETCH PROP ON follow $-.s -> $-.d YIELD follow.degree
The following statement returns the degree
values of the follow
edges that start from vertex \"player101\"
. $var = GO FROM \"player101\" OVER follow YIELD follow._src AS s, follow._dst AS d; FETCH PROP ON follow $var.s -> $var.d YIELD follow.degree
You can use user-defined variables to construct similar queries. SHOW
Statement Syntax Example Description SHOW CHARSET SHOW CHARSET
SHOW CHARSET
Shows the available character sets. SHOW COLLATION SHOW COLLATION
SHOW COLLATION
Shows the collations supported by NebulaGraph. SHOW CREATE SPACE SHOW CREATE SPACE <space_name>
SHOW CREATE SPACE basketballplayer
Shows the creating statement of the specified graph space. SHOW CREATE TAG/EDGE SHOW CREATE {TAG <tag_name> | EDGE <edge_name>}
SHOW CREATE TAG player
Shows the basic information of the specified tag. SHOW HOSTS SHOW HOSTS [GRAPH | STORAGE | META]
SHOW HOSTS
SHOW HOSTS GRAPH
Shows the host and version information of Graph Service, Storage Service, and Meta Service. SHOW INDEX STATUS SHOW {TAG | EDGE} INDEX STATUS
SHOW TAG INDEX STATUS
Shows the status of jobs that rebuild native indexes, which helps check whether a native index is successfully rebuilt or not. SHOW INDEXES SHOW {TAG | EDGE} INDEXES
SHOW TAG INDEXES
Shows the names of existing native indexes. SHOW PARTS SHOW PARTS [<part_id>]
SHOW PARTS
Shows the information of a specified partition or all partitions in a graph space. SHOW ROLES SHOW ROLES IN <space_name>
SHOW ROLES in basketballplayer
Shows the roles that are assigned to a user account. SHOW SNAPSHOTS SHOW SNAPSHOTS
SHOW SNAPSHOTS
Shows the information of all the snapshots. SHOW SPACES SHOW SPACES
SHOW SPACES
Shows existing graph spaces in NebulaGraph. SHOW STATS SHOW STATS
SHOW STATS
Shows the statistics of the graph space collected by the latest STATS
job. SHOW TAGS/EDGES SHOW TAGS | EDGES
SHOW TAGS
, SHOW EDGES
Shows all the tags or edge types in the current graph space. SHOW USERS SHOW USERS
SHOW USERS
Shows the user information. SHOW SESSIONS SHOW SESSIONS
SHOW SESSIONS
Shows the information of all the sessions. SHOW SESSIONS SHOW SESSION <Session_Id>
SHOW SESSION 1623304491050858
Shows a specified session with its ID. SHOW QUERIES SHOW [ALL] QUERIES
SHOW QUERIES
Shows the information of working queries in the current session. SHOW META LEADER SHOW META LEADER
SHOW META LEADER
Shows the information of the leader in the current Meta cluster. GROUP BY <var> YIELD <var>, <aggregation_function(var)>
GO FROM \"player100\" OVER follow BIDIRECT YIELD $$.player.name as Name | GROUP BY $-.Name YIELD $-.Name as Player, count(*) AS Name_Count
Finds all the vertices connected directly to vertex \"player100\"
, groups the result set by player names, and counts how many times the name shows up in the result set. LIMIT YIELD <var> [| LIMIT [<offset_value>,] <number_rows>]
GO FROM \"player100\" OVER follow REVERSELY YIELD $$.player.name AS Friend, $$.player.age AS Age | ORDER BY $-.Age, $-.Friend | LIMIT 1, 3
Returns the 3 rows of data starting from the second row of the sorted output. SKIP RETURN <var> [SKIP <offset>] [LIMIT <number_rows>]
MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) RETURN v2.player.name AS Name, v2.player.age AS Age ORDER BY Age DESC SKIP 1
SKIP
can be used alone to set the offset and return the data after the specified position. SAMPLE <go_statement> SAMPLE <sample_list>;
GO 3 STEPS FROM \"player100\" OVER * YIELD properties($$).name AS NAME, properties($$).age AS Age SAMPLE [1,2,3];
Takes samples evenly in the result set and returns the specified amount of data. ORDER BY <YIELD clause> ORDER BY <expression> [ASC | DESC] [, <expression> [ASC | DESC] ...]
FETCH PROP ON player \"player100\", \"player101\", \"player102\", \"player103\" YIELD player.age AS age, player.name AS name | ORDER BY $-.age ASC, $-.name DESC
The ORDER BY
clause specifies the order of the rows in the output. RETURN RETURN {<vertex_name>|<edge_name>|<vertex_name>.<property>|<edge_name>.<property>|...}
MATCH (v:player) RETURN v.player.name, v.player.age LIMIT 3
Returns the first three rows with values of the vertex properties name
and age
. TTL CREATE TAG <tag_name>(<property_name_1> <property_value_1>, <property_name_2> <property_value_2>, ...) ttl_duration= <value_int>, ttl_col = <property_name>
CREATE TAG t2(a int, b int, c string) ttl_duration= 100, ttl_col = \"a\"
Create a tag and set the TTL options. WHERE WHERE {<vertex|edge_alias>.<property_name> {>|==|<|...} <value>...}
MATCH (v:player) WHERE v.player.name == \"Tim Duncan\" XOR (v.player.age < 30 AND v.player.name == \"Yao Ming\") OR NOT (v.player.name == \"Yao Ming\" OR v.player.name == \"Tim Duncan\") RETURN v.player.name, v.player.age
The WHERE
clause filters the output by conditions. The WHERE
clause usually works in Native nGQL GO
and LOOKUP
statements, and OpenCypher MATCH
and WITH
statements. YIELD YIELD [DISTINCT] <col> [AS <alias>] [, <col> [AS <alias>] ...] [WHERE <conditions>];
GO FROM \"player100\" OVER follow YIELD dst(edge) AS ID | FETCH PROP ON player $-.ID YIELD player.age AS Age | YIELD AVG($-.Age) as Avg_age, count(*)as Num_friends
Finds the players that \"player100\" follows and calculates their average age. WITH MATCH $expressions WITH {nodes()|labels()|...}
MATCH p=(v:player{name:\"Tim Duncan\"})--() WITH nodes(p) AS n UNWIND n AS n1 RETURN DISTINCT n1
The WITH
clause can retrieve the output from a query part, process it, and pass it to the next query part as the input. UNWIND UNWIND <list> AS <alias> <RETURN clause>
UNWIND [1,2,3] AS n RETURN n
Splits a list into rows."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#space_statements","title":"Space statements","text":"Statement Syntax Example Description CREATE SPACE CREATE SPACE [IF NOT EXISTS] <graph_space_name> ( [partition_num = <partition_number>,] [replica_factor = <replica_number>,] vid_type = {FIXED_STRING(<N>) | INT[64]} ) [COMMENT = '<comment>']
CREATE SPACE my_space_1 (vid_type=FIXED_STRING(30))
Creates a graph space with the specified parameters. CREATE SPACE CREATE SPACE <new_graph_space_name> AS <old_graph_space_name>
CREATE SPACE my_space_4 as my_space_3
Clones a graph space. USE USE <graph_space_name>
USE space1
Specifies a graph space as the current working graph space for subsequent queries. SHOW SPACES SHOW SPACES
SHOW SPACES
Lists all the graph spaces in the NebulaGraph cluster. DESCRIBE SPACE DESC[RIBE] SPACE <graph_space_name>
DESCRIBE SPACE basketballplayer
Returns the information about the specified graph space. CLEAR SPACE CLEAR SPACE [IF EXISTS] <graph_space_name>
Deletes the vertices and edges in a graph space, but does not delete the graph space itself and the schema information. DROP SPACE DROP SPACE [IF EXISTS] <graph_space_name>
DROP SPACE basketballplayer
Deletes everything in the specified graph space."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#tag_statements","title":"TAG statements","text":"Statement Syntax Example Description CREATE TAG CREATE TAG [IF NOT EXISTS] <tag_name> ( <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'] [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] ) [TTL_DURATION = <ttl_duration>] [TTL_COL = <prop_name>] [COMMENT = '<comment>']
CREATE TAG woman(name string, age int, married bool, salary double, create_time timestamp) TTL_DURATION = 100, TTL_COL = \"create_time\"
Creates a tag with the given name in a graph space. DROP TAG DROP TAG [IF EXISTS] <tag_name>
DROP TAG test;
Drops a tag with the given name in the current working graph space. ALTER TAG ALTER TAG <tag_name> <alter_definition> [, alter_definition] ...] [ttl_definition [, ttl_definition] ... ] [COMMENT = '<comment>']
ALTER TAG t1 ADD (p3 int, p4 string)
Alters the structure of a tag with the given name in a graph space. You can add or drop properties, and change the data type of an existing property. You can also set a TTL (Time-To-Live) on a property, or change its TTL duration. SHOW TAGS SHOW TAGS
SHOW TAGS
Shows the name of all tags in the current graph space. DESCRIBE TAG DESC[RIBE] TAG <tag_name>
DESCRIBE TAG player
Returns the information about a tag with the given name in a graph space, such as field names, data type, and so on. DELETE TAG DELETE TAG <tag_name_list> FROM <VID>
DELETE TAG test1 FROM \"test\"
Deletes a tag with the given name on a specified vertex."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#edge_type_statements","title":"Edge type statements","text":"Statement Syntax Example Description CREATE EDGE CREATE EDGE [IF NOT EXISTS] <edge_type_name> ( <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'] [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] ) [TTL_DURATION = <ttl_duration>] [TTL_COL = <prop_name>] [COMMENT = '<comment>']
CREATE EDGE e1(p1 string, p2 int, p3 timestamp) TTL_DURATION = 100, TTL_COL = \"p2\"
Creates an edge type with the given name in a graph space. DROP EDGE DROP EDGE [IF EXISTS] <edge_type_name>
DROP EDGE e1
Drops an edge type with the given name in a graph space. ALTER EDGE ALTER EDGE <edge_type_name> <alter_definition> [, alter_definition] ...] [ttl_definition [, ttl_definition] ... ] [COMMENT = '<comment>']
ALTER EDGE e1 ADD (p3 int, p4 string)
Alters the structure of an edge type with the given name in a graph space. SHOW EDGES SHOW EDGES
SHOW EDGES
Shows all edge types in the current graph space. DESCRIBE EDGE DESC[RIBE] EDGE <edge_type_name>
DESCRIBE EDGE follow
Returns the information about an edge type with the given name in a graph space, such as field names, data type, and so on."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#vertex_statements","title":"Vertex statements","text":"Statement Syntax Example Description INSERT VERTEX INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...] VALUES <vid>: ([prop_value_list])
INSERT VERTEX t2 (name, age) VALUES \"13\":(\"n3\", 12), \"14\":(\"n4\", 8)
Inserts one or more vertices into a graph space in NebulaGraph. DELETE VERTEX DELETE VERTEX <vid> [, <vid> ...]
DELETE VERTEX \"team1\"
Deletes vertices and the related incoming and outgoing edges of the vertices. UPDATE VERTEX UPDATE VERTEX ON <tag_name> <vid> SET <update_prop> [WHEN <condition>] [YIELD <output>]
UPDATE VERTEX ON player \"player101\" SET age = age + 2
Updates properties on tags of a vertex. UPSERT VERTEX UPSERT VERTEX ON <tag> <vid> SET <update_prop> [WHEN <condition>] [YIELD <output>]
UPSERT VERTEX ON player \"player667\" SET age = 31
The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT VERTEX
to update the properties of a vertex if it exists or insert a new vertex if it does not exist."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#edge_statements","title":"Edge statements","text":"Statement Syntax Example Description INSERT EDGE INSERT EDGE [IF NOT EXISTS] <edge_type> ( <prop_name_list> ) VALUES <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ) [, <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ), ...]
INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 1)
Inserts an edge or multiple edges into a graph space from a source vertex (given by src_vid) to a destination vertex (given by dst_vid) with a specific rank in NebulaGraph. DELETE EDGE DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid>[@<rank>] ...]
DELETE EDGE serve \"player100\" -> \"team204\"@0
Deletes one edge or multiple edges at a time. UPDATE EDGE UPDATE EDGE ON <edge_type> <src_vid> -> <dst_vid> [@<rank>] SET <update_prop> [WHEN <condition>] [YIELD <output>]
UPDATE EDGE ON serve \"player100\" -> \"team204\"@0 SET start_year = start_year + 1
Updates properties on an edge. UPSERT EDGE UPSERT EDGE ON <edge_type> <src_vid> -> <dst_vid> [@rank] SET <update_prop> [WHEN <condition>] [YIELD <properties>]
UPSERT EDGE on serve \"player666\" -> \"team200\"@0 SET end_year = 2021
The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT EDGE
to update the properties of an edge if it exists or insert a new edge if it does not exist."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#index","title":"Index","text":"Native index
You can use native indexes together with LOOKUP
and MATCH
statements.
CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name> ON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT = '<comment>']
CREATE TAG INDEX player_index on player()
Adds native indexes for the existing tags, edge types, or properties. SHOW CREATE INDEX SHOW CREATE {TAG | EDGE} INDEX <index_name>
show create tag index index_2
Shows the statement used when creating a tag or edge type index. It contains detailed information about the index, such as its associated properties. SHOW INDEXES SHOW {TAG | EDGE} INDEXES
SHOW TAG INDEXES
Shows the defined tag or edge type indexes names in the current graph space. DESCRIBE INDEX DESCRIBE {TAG | EDGE} INDEX <index_name>
DESCRIBE TAG INDEX player_index_0
Gets the information about the index with a given name, including the property name (Field) and the property type (Type) of the index. REBUILD INDEX REBUILD {TAG | EDGE} INDEX [<index_name_list>]
REBUILD TAG INDEX single_person_index
Rebuilds the created tag or edge type index. If data is updated or inserted before the creation of the index, you must rebuild the indexes manually to make sure that the indexes contain the previously added data. SHOW INDEX STATUS SHOW {TAG | EDGE} INDEX STATUS
SHOW TAG INDEX STATUS
Returns the name of the created tag or edge type index and its status. DROP INDEX DROP {TAG | EDGE} INDEX [IF EXISTS] <index_name>
DROP TAG INDEX player_index_0
Removes an existing index from the current graph space. Full-text index
Syntax Example DescriptionSIGN IN TEXT SERVICE [(<elastic_ip:port> [,<username>, <password>]), (<elastic_ip:port>), ...]
SIGN IN TEXT SERVICE (127.0.0.1:9200)
Full-text indexes are implemented based on Elasticsearch. After deploying an Elasticsearch cluster, you can use the SIGN IN
statement to log in to the Elasticsearch client. SHOW TEXT SEARCH CLIENTS
SHOW TEXT SEARCH CLIENTS
Shows text search clients. SIGN OUT TEXT SERVICE
SIGN OUT TEXT SERVICE
Signs out of the text search clients. CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} (<prop_name> [,<prop_name>]...) [ANALYZER=\"<analyzer_name>\"]
CREATE FULLTEXT TAG INDEX nebula_index_1 ON player(name)
Creates full-text indexes. SHOW FULLTEXT INDEXES
SHOW FULLTEXT INDEXES
Shows full-text indexes. REBUILD FULLTEXT INDEX
REBUILD FULLTEXT INDEX
Rebuilds full-text indexes. DROP FULLTEXT INDEX <index_name>
DROP FULLTEXT INDEX nebula_index_1
Drops full-text indexes. LOOKUP ON {<tag> | <edge_type>} WHERE ES_QUERY(<index_name>, \"<text>\") YIELD <return_list> [| LIMIT [<offset>,] <number_rows>]
LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Chris\") YIELD id(vertex)
Uses a full-text index in a LOOKUP statement with query options. GET SUBGRAPH [WITH PROP] [<step_count> {STEP|STEPS}] FROM {<vid>, <vid>...} [{IN | OUT | BOTH} <edge_type>, <edge_type>...] YIELD [VERTICES AS <vertex_alias>] [,EDGES AS <edge_alias>]
GET SUBGRAPH 1 STEPS FROM \"player100\" YIELD VERTICES AS nodes, EDGES AS relationships
Retrieves information of vertices and edges reachable from the source vertices of the specified edge types and returns information of the subgraph. FIND PATH FIND { SHORTEST | ALL | NOLOOP } PATH [WITH PROP] FROM <vertex_id_list> TO <vertex_id_list> OVER <edge_type_list> [REVERSELY | BIDIRECT] [<WHERE clause>] [UPTO <N> {STEP|STEPS}] YIELD path as <alias> [| ORDER BY $-.path] [| LIMIT <M>]
FIND SHORTEST PATH FROM \"player102\" TO \"team204\" OVER * YIELD path as p
Finds the paths between the selected source vertices and destination vertices. A returned path is like (<vertex_id>)-[:<edge_type_name>@<rank>]->(<vertex_id>)
."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#query_tuning_statements","title":"Query tuning statements","text":"Type Syntax Example Description EXPLAIN EXPLAIN [format=\"row\" | \"dot\"] <your_nGQL_statement>
EXPLAIN format=\"row\" SHOW TAGS
EXPLAIN format=\"dot\" SHOW TAGS
Helps output the execution plan of an nGQL statement without executing the statement. PROFILE PROFILE [format=\"row\" | \"dot\"] <your_nGQL_statement>
PROFILE format=\"row\" SHOW TAGS
EXPLAIN format=\"dot\" SHOW TAGS
Executes the statement, then outputs the execution plan as well as the execution profile."},{"location":"2.quick-start/6.cheatsheet-for-ngql/#operation_and_maintenance_statements","title":"Operation and maintenance statements","text":"SUBMIT JOB BALANCE
Syntax DescriptionBALANCE LEADER
Starts a job to balance the distribution of all the storage leaders in graph spaces. It returns the job ID. Job statements
Syntax DescriptionSUBMIT JOB COMPACT
Triggers the long-term RocksDB compact
operation. SUBMIT JOB FLUSH
Writes the RocksDB memfile in the memory to the hard disk. SUBMIT JOB STATS
Starts a job that collects the statistics of the current graph space. Once this job succeeds, you can use the SHOW STATS
statement to list the statistics. SHOW JOB <job_id>
Shows the information about a specific job and all its tasks in the current graph space. The Meta Service parses a SUBMIT JOB
request into multiple tasks and assigns them to the nebula-storaged processes. SHOW JOBS
Lists all the unexpired jobs in the current graph space. STOP JOB
Stops jobs that are not finished in the current graph space. RECOVER JOB
Re-executes the failed jobs in the current graph space and returns the number of recovered jobs. Kill queries
Syntax Example DescriptionKILL QUERY (session=<session_id>, plan=<plan_id>)
KILL QUERY(SESSION=1625553545984255,PLAN=163)
Terminates the query being executed, and is often used to terminate slow queries. Kill sessions
Syntax Example DescriptionKILL {SESSION|SESSIONS} <SessionId>
KILL SESSION 1672887983842984
Terminates a single session. SHOW SESSIONS | YIELD $-.SessionId AS sid [WHERE <filter_clause>] | KILL {SESSION|SESSIONS} $-.sid
SHOW SESSIONS | YIELD $-.SessionId AS sid, $-.CreateTime as CreateTime | ORDER BY $-.CreateTime ASC | LIMIT 2 | KILL SESSIONS $-.sid
Terminates multiple sessions based on specified criteria. SHOW SESSIONS | KILL SESSIONS $-.SessionId
SHOW SESSIONS | KILL SESSIONS $-.SessionId
Terminates all sessions. This topic lists the frequently asked questions for using NebulaGraph master. You can use the search box in the help center or the search function of the browser to match the questions you are looking for.
If the solutions described in this topic cannot solve your problems, ask for help on the NebulaGraph forum or submit an issue on GitHub.
"},{"location":"20.appendix/0.FAQ/#about_manual_updates","title":"About manual updates","text":""},{"location":"20.appendix/0.FAQ/#why_is_the_behavior_in_the_manual_not_consistent_with_the_system","title":"\"Why is the behavior in the manual not consistent with the system?\"","text":"NebulaGraph is still under development. Its behavior changes from time to time. Users can submit an issue to inform the team if the manual and the system are not consistent.
Note
If you find some errors in this topic, click the pencil
button at the top right side of this page.Compatibility
NebulaGraph master is not compatible with NebulaGraph 1.x or 2.0-RC in either data formats or RPC protocols, and vice versa. The service process may quit if a lower-version client is used to connect to a higher-version server.
To upgrade data formats, see Upgrade NebulaGraph to the current version. Users must upgrade all clients.
"},{"location":"20.appendix/0.FAQ/#about_execution_errors","title":"About execution errors","text":""},{"location":"20.appendix/0.FAQ/#how_to_resolve_the_error_-1005graphmemoryexceeded_-2600","title":"\"How to resolve the error-1005:GraphMemoryExceeded: (-2600)
?\"","text":"This error is issued by the Memory Tracker when it observes that memory usage has exceeded a set threshold. This mechanism can help avoid service processes from being terminated by the system's OOM (Out of Memory) killer. Steps to resolve:
Check memory usage: First, you need to check the memory usage during the execution of the command. If the memory usage is indeed high, then this error might be expected.
Check the configuration of the Memory Tracker: If the memory usage is not high, check the relevant configurations of the Memory Tracker. These include memory_tracker_untracked_reserved_memory_mb
(untracked reserved memory in MB), memory_tracker_limit_ratio
(memory limit ratio), and memory_purge_enabled
(whether memory purge is enabled). For the configuration of the Memory Tracker, see memory tracker configuration.
Optimize configurations: Adjust these configurations according to the actual situation. For example, if the available memory limit is too low, you can increase the value of memory_tracker_limit_ratio. A minimal configuration sketch follows.
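A minimal nebula-graphd.conf excerpt touching these three parameters might look as follows; the values shown are illustrative defaults, not tuning advice:
--memory_tracker_untracked_reserved_memory_mb=50
--memory_tracker_limit_ratio=0.8
--memory_purge_enabled=true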
SemanticError: Missing yield clause.
?\"","text":"Starting with NebulaGraph 3.0.0, the statements LOOKUP
, GO
, and FETCH
must output results with the YIELD
clause. For more information, see YIELD.
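For example, assuming the basketballplayer dataset, a LOOKUP that was accepted before 3.0.0 must now carry a YIELD clause:
# Rejected since 3.0.0: no YIELD clause.
LOOKUP ON player WHERE player.name == \"Tony Parker\";
# Accepted: the YIELD clause specifies the output.
LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD id(vertex);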
Host not enough!
?\"","text":"From NebulaGraph version 3.0.0, the Storage services added in the configuration files CANNOT be read or written directly. The configuration files only register the Storage services into the Meta services. You must run the ADD HOSTS
command to read and write data on Storage servers. For more information, see Manage Storage hosts.
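For example, a sketch of registering two Storage hosts after startup (the addresses are hypothetical):
nebula> ADD HOSTS 192.168.10.100:9779, 192.168.10.101:9779;
nebula> SHOW HOSTS; # Verify that the new hosts are ONLINE.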
To get the property of the vertex in 'v.age', should use the format 'var.tag.prop'
?\"","text":"From NebulaGraph version 3.0.0, patterns support matching multiple tags at the same time, so you need to specify a tag name when querying properties. The original statement RETURN variable_name.property_name
is changed to RETURN variable_name.<tag_name>.property_name
.
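For example, with the basketballplayer dataset:
# Accepted before 3.0.0:
MATCH (v:player) RETURN v.name;
# Since 3.0.0, the tag name must be specified:
MATCH (v:player) RETURN v.player.name;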
Used memory hits the high watermark(0.800000) of total system memory.
?\"","text":"The error may be caused if the system memory usage is higher than the threshold specified bysystem_memory_high_watermark_ratio
, which defaults to 0.8
. When the threshold is exceeded, an alarm is triggered and NebulaGraph stops processing queries.
Possible solutions are as follows:
system_memory_high_watermark_ratio
parameter to the configuration files of all Graph servers, and set it greater than 0.8
, such as 0.9
. However, the system_memory_high_watermark_ratio
parameter is deprecated. It is recommended that you use the Memory Tracker feature instead to limit the memory usage of Graph and Storage services. For more information, see Memory Tracker for Graph service and Memory Tracker for Storage service.
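For reference, a sketch of the legacy setting in nebula-graphd.conf; the value is illustrative, and the Memory Tracker is preferred on current versions:
--system_memory_high_watermark_ratio=0.9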
Storage Error E_RPC_FAILURE
?\"","text":"The reason for this error is usually that the storaged process returns too many data back to the graphd process. Possible solutions are as follows:
--storage_client_timeout_ms
in the nebula-graphd.conf
file to extend the connection timeout of the Storage client. This configuration is measured in milliseconds (ms). For example, set --storage_client_timeout_ms=60000
. If this parameter is not specified in the nebula-graphd.conf
file, specify it manually. Tip: Add --local_config=true
at the beginning of the configuration file and restart the service.LIMIT
is used to limit the number of returned results, use the GO
statement to rewrite the MATCH
statement (the former is optimized, while the latter is not).dmesg |grep nebula
).The leader has changed. Try again later
?\"","text":"It is a known issue. Just retry 1 to N times, where N is the partition number. The reason is that the meta client needs some heartbeats to update or errors to trigger the new leader information.
If this error occurs when logging in to NebulaGraph, you can consider using df -h
to view the disk space and check whether the local disk is full.
Schema not exist: xxx
?\"","text":"If the system returns Schema not exist
when querying, make sure that:
Problem description: The system reports Could not find artifact com.vesoft:client:jar:xxx-SNAPSHOT
when compiling.
Cause: There is no local Maven repository for storing or downloading SNAPSHOT packages. The default central repository in Maven only stores official releases, not development versions (SNAPSHOTs).
Solution: Add the following configuration in the profiles
scope of Maven's setting.xml
file:
<profile>\n <activation>\n <activeByDefault>true</activeByDefault>\n </activation>\n <repositories>\n <repository>\n <id>snapshots</id>\n <url>https://oss.sonatype.org/content/repositories/snapshots/</url>\n <snapshots>\n <enabled>true</enabled>\n </snapshots>\n </repository>\n </repositories>\n </profile>\n
"},{"location":"20.appendix/0.FAQ/#how_to_resolve_error_-1004_syntaxerror_syntax_error_near","title":"\"How to resolve [ERROR (-1004)]: SyntaxError: syntax error near
?\"","text":"In most cases, a query statement requires a YIELD
or a RETURN
. Check your query statement to see if YIELD
or RETURN
is provided.
can\u2019t solve the start vids from the sentence
?\"","text":"The graphd process requires start vids
to begin a graph traversal. The start vids
can be specified by the user. For example:
> GO FROM ${vids} ...\n> MATCH (src) WHERE id(src) == ${vids}\n# The \"start vids\" are explicitly given by ${vids}.\n
It can also be found from a property index. For example:
# CREATE TAG INDEX IF NOT EXISTS i_player ON player(name(20));\n# REBUILD TAG INDEX i_player;\n\n> LOOKUP ON player WHERE player.name == \"abc\" | ... YIELD ...\n> MATCH (src) WHERE src.name == \"abc\" ...\n# The \"start vids\" are found from the property index \"name\".\n
Otherwise, an error like can\u2019t solve the start vids from the sentence
will be returned.
Wrong vertex id type: 1001
?\"","text":"Check whether the VID is INT64
or FIXED_STRING(N)
set by create space
. For more information, see create space.
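To check which VID type a graph space uses, you can read the Vid Type column returned by DESCRIBE SPACE. For example:
nebula> DESC SPACE basketballplayer;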
The VID must be a 64-bit integer or a string fitting space vertex id length limit.
?\"","text":"Check whether the length of the VID exceeds the limitation. For more information, see create space.
"},{"location":"20.appendix/0.FAQ/#how_to_resolve_the_error_edge_conflict_or_vertex_conflict","title":"\"How to resolve the erroredge conflict
or vertex conflict
?\"","text":"NebulaGraph may return such errors when the Storage service receives multiple requests to insert or update the same vertex or edge within milliseconds. Try the failed requests again later.
"},{"location":"20.appendix/0.FAQ/#how_to_resolve_the_error_rpc_failure_in_metaclient_connection_refused","title":"\"How to resolve the errorRPC failure in MetaClient: Connection refused
?\"","text":"The reason for this error is usually that the metad service status is unusual, or the network of the machine where the metad and graphd services are located is disconnected. Possible solutions are as follows:
telnet meta-ip:port
to check the network status under the server that returns an error.StorageClientBase.inl:214] Request to \"x.x.x.x\":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Timed Out
in nebula-graph.INFO
?\"","text":"The reason for this error may be that the amount of data to be queried is too large, and the storaged process has timed out. Possible solutions are as follows:
--storage_client_timeout_ms
in the nebula-graphd.conf
file. This configuration is measured in milliseconds (ms). The default value is 60000ms.MetaClient.cpp:65] Heartbeat failed, status:Wrong cluster!
in nebula-storaged.INFO
, or HBProcessor.cpp:54] Reject wrong cluster host \"x.x.x.x\":9771!
in nebula-metad.INFO
?\"","text":"The reason for this error may be that the user has modified the IP or the port information of the metad process, or the storage service has joined other clusters before. Possible solutions are as follows:
Delete the cluster.id
file in the installation directory where the storage machine is deployed (the default installation directory is /usr/local/nebula
), and restart the storaged service.
Storage Error: More than one request trying to add/update/delete one edge/vertex at the same time.
?\"","text":"The reason for this error is that the current NebulaGraph version does not support concurrent requests to the same vertex or edge at the same time. To solve this error, re-execute your commands.
"},{"location":"20.appendix/0.FAQ/#about_design_and_functions","title":"About design and functions","text":""},{"location":"20.appendix/0.FAQ/#how_is_the_time_spent_value_at_the_end_of_each_return_message_calculated","title":"\"How is thetime spent
value at the end of each return message calculated?\"","text":"Take the returned message of SHOW SPACES
as an example:
nebula> SHOW SPACES;\n+--------------------+\n| Name |\n+--------------------+\n| \"basketballplayer\" |\n+--------------------+\nGot 1 rows (time spent 1235/1934 us)\n
1235
shows the time spent by the database itself, that is, the time it takes for the query engine to receive a query from the client, fetch the data from the storage server, and perform a series of calculations.1934
shows the time spent from the client's perspective, that is, the time it takes for the client from sending a request, receiving a response, and displaying the result on the screen.nebula-storaged
process keep showing red after connecting to NebulaGraph?\"","text":"Because the nebula-storaged
process waits for nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
This is caused by the release of NebulaGraph Console 2.6.0, not the change of NebulaGraph core. And it will not affect the content of the returned data itself.
"},{"location":"20.appendix/0.FAQ/#about_dangling_edges","title":"About dangling edges","text":"A dangling edge is an edge that only connects to a single vertex and only one part of the edge connects to the vertex.
Dangling edges may appear in NebulaGraph master as the design. And there is no MERGE
statements of openCypher. The guarantee for dangling edges depends entirely on the application level. For more information, see INSERT VERTEX, DELETE VERTEX, INSERT EDGE, DELETE EDGE.
replica_factor
as an even number in CREATE SPACE
statements, e.g., replica_factor = 2
?\"","text":"NO.
The Storage service guarantees its availability based on the Raft consensus protocol. The number of failed replicas must not exceed half of the total replica number.
When the number of machines is 1, replica_factor
can only be set to1
.
When there are enough machines and replica_factor=2
, if one replica fails, the Storage service fails. No matter replica_factor=3
or replica_factor=4
, if more than one replica fails, the Storage Service fails. To prevent unnecessary waste of resources, we recommend that you set an odd replica number.
We suggest that you set replica_factor=3
for a production environment and replica_factor=1
for a test environment. Do not use an even number.
Yes. For more information, see Kill query.
"},{"location":"20.appendix/0.FAQ/#why_are_the_query_results_different_when_using_go_and_match_to_execute_the_same_semantic_query","title":"\"Why are the query results different when usingGO
and MATCH
to execute the same semantic query?\"","text":"The possible reasons are listed as follows.
GO
statements find the dangling edges.RETURN
commands do not specify the sequence.max_edge_returned_per_vertex
in the Storage service is triggered.Using different types of paths may cause different query results.
GO
statements use walk
. Both vertices and edges can be repeatedly visited in graph traversal.MATCH
statements are compatible with openCypher and use trail
. Only vertices can be repeatedly visited in graph traversal.The example is as follows.
All queries that start from A
with 5 hops will end at C
(A->B->C->D->E->C
). If it is 6 hops, the GO
statement will end at D
(A->B->C->D->E->C->D
), because the edge C->D
can be visited repeatedly. However, the MATCH
statement returns empty, because edges cannot be visited repeatedly.
Therefore, using GO
and MATCH
to execute the same semantic query may cause different query results.
For more information, see Wikipedia.
"},{"location":"20.appendix/0.FAQ/#how_to_count_the_verticesedges_number_of_each_tagedge_type","title":"\"How to count the vertices/edges number of each tag/edge type?\"","text":"See show-stats.
"},{"location":"20.appendix/0.FAQ/#how_to_get_all_the_verticesedge_of_each_tagedge_type","title":"\"How to get all the vertices/edge of each tag/edge type?\"","text":"Create and rebuild the index.
> CREATE TAG INDEX IF NOT EXISTS i_player ON player();\n> REBUILD TAG INDEX IF NOT EXISTS i_player;\n
Use LOOKUP
or MATCH
. For example:
> LOOKUP ON player;\n> MATCH (n:player) RETURN n;\n
For more information, see INDEX
, LOOKUP
, and MATCH
.
Yes, for more information, see Keywords and reserved words.
"},{"location":"20.appendix/0.FAQ/#how_to_get_the_out-degreethe_in-degree_of_a_given_vertex","title":"\"How to get the out-degree/the in-degree of a given vertex?\"","text":"The out-degree of a vertex refers to the number of edges starting from that vertex, while the in-degree refers to the number of edges pointing to that vertex.
nebula > MATCH (s)-[e]->() WHERE id(s) == \"given\" RETURN count(e); #Out-degree\nnebula > MATCH (s)<-[e]-() WHERE id(s) == \"given\" RETURN count(e); #In-degree\n
This is a very slow operation to get the out/in degree since no accelaration can be applied (no indices or caches). It also could be out-of-memory when hitting a supper-node.
"},{"location":"20.appendix/0.FAQ/#how_to_quickly_get_the_out-degree_and_in-degree_of_all_vertices","title":"\"How to quickly get the out-degree and in-degree of all vertices?\"","text":"There is no such command.
You can use NebulaGraph Algorithm.
"},{"location":"20.appendix/0.FAQ/#about_operation_and_maintenance","title":"About operation and maintenance","text":""},{"location":"20.appendix/0.FAQ/#the_runtime_log_files_are_too_large_how_to_recycle_the_logs","title":"\"The runtime log files are too large. How to recycle the logs?\"","text":"NebulaGraph uses glog for log printing, which does not support log recycling. You can manage runtime logs by using cron jobs or the log management tool logrotate. For operational details, see Log recycling.
"},{"location":"20.appendix/0.FAQ/#how_to_check_the_nebulagraph_version","title":"\"How to check the NebulaGraph version?\"","text":"If the service is running: run command SHOW HOSTS META
in nebula-console
. See SHOW HOSTS.
If the service is not running:
Different installation methods make the method of checking the version different. The instructions are as follows:
If the service is not running, run the command ./<binary_name> --version
to get the version and the Git commit IDs of the NebulaGraph binary files. For example:
$ ./nebula-graphd --version\n
If you deploy NebulaGraph with Docker Compose
Check the version of NebulaGraph deployed by Docker Compose. The method is similar to the previous method, except that you have to enter the container first. The commands are as follows:
docker exec -it nebula-docker-compose_graphd_1 bash\ncd bin/\n./nebula-graphd --version\n
If you install NebulaGraph with RPM/DEB package
Run rpm -qa |grep nebula
to check the version of NebulaGraph.
Warning
The cluster scaling function has not been officially released in the community edition. The operations involving SUBMIT JOB BALANCE DATA REMOVE
and SUBMIT JOB BALANCE DATA
are experimental features in the community edition and the functionality is not stable. Before using it in the community edition, make sure to back up your data first and set enable_experimental_feature
and enable_data_balance
to true
in the Graph configuration file.
NebulaGraph master does not provide any commands or tools to support automatic scale out/in. You can refer to the following steps:
Scale out and scale in metad: The metad process can not be scaled out or scale in. The process cannot be moved to a new machine. You cannot add a new metad process to the service.
Note
You can use the Meta transfer script tool to migrate Meta services. Note that the Meta-related settings in the configuration files of Storage and Graph services need to be modified correspondingly.
Scale in storaged: See Balance remove command. After the command is finished, stop this storaged process.
Caution
Scale out storaged: Prepare the binary and config files of the storaged process in the new host, modify the config files and add all existing addresses of the metad processes. Then register the storaged process to the metad, and then start the new storaged process. For details, see Register storaged services.
You also need to run Balance Data and Balance leader after scaling in/out storaged.
Currently, Storage cannot dynamically recognize new added disks. You can add or remove disks in the Storage nodes of the distributed cluster by following these steps:
Execute SUBMIT JOB BALANCE DATA REMOVE <ip:port>
to migrate data in the Storage node with the disk to be added or removed to other Storage nodes.
Caution
Execute DROP HOSTS <ip:port>
to remove the Storage node with the disk to be added or removed.
In the configuration file of all Storage nodes, configure the path of the new disk to be added or removed through --data_path
, see Storage configuration file for details.
ADD HOSTS <ip:port>
to re-add the Storage node with the disk to be added or removed.SUBMIT JOB BALANCE DATA
to evenly distribute the shards of the current space to all Storage nodes and execute SUBMIT JOB BALANCE LEADER
command to balance the leaders in all spaces. Before running the command, select a space.OFFLINE
. What should I do?\"","text":"Hosts with the status of OFFLINE
will be automatically deleted after one day.
The dmp file is an error report file detailing the exit of the process and can be viewed with the gdb utility. the Coredump file is saved in the directory of the startup binary (by default it is /usr/local/nebula
) and is generated automatically when the NebulaGraph service crashes.
$ file core.<pid>\n
$ gdb <process.name> core.<pid>\n
$(gdb) bt\n
For example:
$ file core.1316027\ncore.1316027: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from '/home/workspace/fork/nebula-debug/bin/nebula-metad --flagfile /home/k', real uid: 1008, effective uid: 1008, real gid: 1008, effective gid: 1008, execfn: '/home/workspace/fork/nebula-debug/bin/nebula-metad', platform: 'x86_64'\n\n$ gdb /home/workspace/fork/nebula-debug/bin/nebula-metad core.1316027\n\n$(gdb) bt\n#0 0x00007f9de58fecf5 in __memcpy_ssse3_back () from /lib64/libc.so.6\n#1 0x0000000000eb2299 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) ()\n#2 0x0000000000ef71a7 in nebula::meta::cpp2::QueryDesc::QueryDesc(nebula::meta::cpp2::QueryDesc const&) ()\n...\n
If you are not clear about the information that dmp prints out, you can post the printout with the OS version, hardware configuration, error logs before and after the Core file was created and actions that may have caused the error on the NebulaGraph forum.
"},{"location":"20.appendix/0.FAQ/#how_can_i_set_the_nebulagraph_service_to_start_automatically_on_boot_via_systemctl","title":"How can I set the NebulaGraph service to start automatically on boot via systemctl?","text":"Execute systemctl enable
to start the metad, graphd and storaged services.
[root]# systemctl enable nebula-metad.service\nCreated symlink from /etc/systemd/system/multi-user.target.wants/nebula-metad.service to /usr/lib/systemd/system/nebula-metad.service.\n[root]# systemctl enable nebula-graphd.service\nCreated symlink from /etc/systemd/system/multi-user.target.wants/nebula-graphd.service to /usr/lib/systemd/system/nebula-graphd.service.\n[root]# systemctl enable nebula-storaged.service\nCreated symlink from /etc/systemd/system/multi-user.target.wants/nebula-storaged.service to /usr/lib/systemd/system/nebula-storaged.service.\n
Configure the service files for metad, graphd and storaged to set the service to pull up automatically.
Caution
The following points need to be noted when configuring the service file. - The paths of the PIDFile, ExecStart, ExecReload and ExecStop parameters need to be the same as those on the server. - RestartSec is the length of time (in seconds) to wait before restarting, which can be modified according to the actual situation. - (Optional) StartLimitInterval is the unlimited restart, the default is 10 seconds if the restart exceeds 5 times, and set to 0 means unlimited restart. - (Optional) LimitNOFILE is the maximum number of open files for the service, the default is 1024 and can be changed according to the actual situation.
Configure the service file for the metad service.
$ vi /usr/lib/systemd/system/nebula-metad.service\n\n[Unit]\nDescription=Nebula Graph Metad Service\nAfter=network.target\n\n[Service ]\nType=forking\nRestart=always\nRestartSec=15s\nPIDFile=/usr/local/nebula/pids/nebula-metad.pid\nExecStart=/usr/local/nebula/scripts/nebula.service start metad\nExecReload=/usr/local/nebula/scripts/nebula.service restart metad\nExecStop=/usr/local/nebula/scripts/nebula.service stop metad\nPrivateTmp=true\nStartLimitInterval=0\nLimitNOFILE=1024\n\n[Install]\nWantedBy=multi-user.target\n
Configure the service file for the graphd service.
$ vi /usr/lib/systemd/system/nebula-graphd.service\n[Unit]\nDescription=Nebula Graph Graphd Service\nAfter=network.target\n\n[Service]\nType=forking\nRestart=always\nRestartSec=15s\nPIDFile=/usr/local/nebula/pids/nebula-graphd.pid\nExecStart=/usr/local/nebula/scripts/nebula.service start graphd\nExecReload=/usr/local/nebula/scripts/nebula.service restart graphd\nExecStop=/usr/local/nebula/scripts/nebula.service stop graphd\nPrivateTmp=true\nStartLimitInterval=0\nLimitNOFILE=1024\n\n[Install]\nWantedBy=multi-user.target\n
Configure the service file for the storaged service. $ vi /usr/lib/systemd/system/nebula-storaged.service\n[Unit]\nDescription=Nebula Graph Storaged Service\nAfter=network.target\n\n[Service]\nType=forking\nRestart=always\nRestartSec=15s\nPIDFile=/usr/local/nebula/pids/nebula-storaged.pid\nExecStart=/usr/local/nebula/scripts/nebula.service start storaged\nExecReload=/usr/local/nebula/scripts/nebula.service restart storaged\nExecStop=/usr/local/nebula/scripts/nebula.service stop storaged\nPrivateTmp=true\nStartLimitInterval=0\nLimitNOFILE=1024\n\n[Install]\nWantedBy=multi-user.target\n
Reload the configuration file.
[root]# sudo systemctl daemon-reload\n
Restart the service.
$ systemctl restart nebula-metad.service\n$ systemctl restart nebula-graphd.service\n$ systemctl restart nebula-storaged.service\n
If you have not modified the predefined ports in the Configurations, open the following ports for the NebulaGraph services:
Service Port Meta 9559, 9560, 19559 Graph 9669, 19669 Storage 9777 ~ 9780, 19779If you have customized the configuration files and changed the predefined ports, find the port numbers in your configuration files and open them on the firewalls.
For more port information, see Port Guide for Company Products.
"},{"location":"20.appendix/0.FAQ/#how_to_test_whether_a_port_is_open_or_closed","title":"\"How to test whether a port is open or closed?\"","text":"You can use telnet as follows to check for port status.
telnet <ip> <port>\n
Note
If you cannot use the telnet command, check if telnet is installed or enabled on your host.
For example:
// If the port is open:\n$ telnet 192.168.1.10 9669\nTrying 192.168.1.10...\nConnected to 192.168.1.10.\nEscape character is '^]'.\n\n// If the port is closed or blocked:\n$ telnet 192.168.1.10 9777\nTrying 192.168.1.10...\ntelnet: connect to address 192.168.1.10: Connection refused\n
"},{"location":"20.appendix/6.eco-tool-version/","title":"Ecosystem tools overview","text":""},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_studio","title":"NebulaGraph Studio","text":"NebulaGraph Studio (Studio for short) is a graph database visualization tool that can be accessed through the Web. It can be used with NebulaGraph DBMS to provide one-stop services such as composition, data import, writing nGQL queries, and graph exploration. For details, see What is NebulaGraph Studio.
Note
The release of the Studio is independent of NebulaGraph core, and its naming method is also not the same as the core naming rules.
NebulaGraph version Studio version master v3.9.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_dashboard_community_edition","title":"NebulaGraph Dashboard Community Edition","text":"NebulaGraph Dashboard Community Edition (Dashboard for short) is a visualization tool for monitoring the status of machines and services in the NebulaGraph cluster. For details, see What is NebulaGraph Dashboard.
NebulaGraph version Dashboard Community version master v3.4.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_exchange","title":"NebulaGraph Exchange","text":"NebulaGraph Exchange (Exchange for short) is an Apache Spark&trade application for batch migration of data in a cluster to NebulaGraph in a distributed environment. It can support the migration of batch data and streaming data in a variety of different formats. For details, see What is NebulaGraph Exchange.
NebulaGraph version Exchange Community version master v3.7.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_operator","title":"NebulaGraph Operator","text":"NebulaGraph Operator (Operator for short) is a tool to automate the deployment, operation, and maintenance of NebulaGraph clusters on Kubernetes. Building upon the excellent scalability mechanism of Kubernetes, NebulaGraph introduced its operation and maintenance knowledge into the Kubernetes system, which makes NebulaGraph a real cloud-native graph database. For more information, see What is NebulaGraph Operator.
NebulaGraph version Operator version master v1.8.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_importer","title":"NebulaGraph Importer","text":"NebulaGraph Importer (Importer for short) is a CSV file import tool for NebulaGraph. The Importer can read the local CSV file, and then import the data into the NebulaGraph database. For details, see What is NebulaGraph Importer.
NebulaGraph version Importer version master v4.1.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_spark_connector","title":"NebulaGraph Spark Connector","text":"NebulaGraph Spark Connector is a Spark connector that provides the ability to read and write NebulaGraph data in the Spark standard format. NebulaGraph Spark Connector consists of two parts, Reader and Writer. For details, see What is NebulaGraph Spark Connector.
NebulaGraph version Spark Connector version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_flink_connector","title":"NebulaGraph Flink Connector","text":"NebulaGraph Flink Connector is a connector that helps Flink users quickly access NebulaGraph. It supports reading data from the NebulaGraph database or writing data read from other external data sources to the NebulaGraph database. For details, see What is NebulaGraph Flink Connector.
NebulaGraph version Flink Connector version master v3.5.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_algorithm","title":"NebulaGraph Algorithm","text":"NebulaGraph Algorithm (Algorithm for short) is a Spark application based on GraphX, which uses a complete algorithm tool to analyze data in the NebulaGraph database by submitting a Spark task To perform graph computing, use the algorithm under the lib repository through programming to perform graph computing for DataFrame. For details, see What is NebulaGraph Algorithm.
NebulaGraph version Algorithm version master v3.0.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_console","title":"NebulaGraph Console","text":"NebulaGraph Console is the native CLI client of NebulaGraph. For how to use it, see NebulaGraph Console.
NebulaGraph version Console version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_docker_compose","title":"NebulaGraph Docker Compose","text":"Docker Compose can quickly deploy NebulaGraph clusters. For how to use it, please refer to Docker Compose Deployment NebulaGraph.
NebulaGraph version Docker Compose version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#backup_restore","title":"Backup & Restore","text":"Backup&Restore (BR for short) is a command line interface (CLI) tool that can help back up the graph space data of NebulaGraph, or restore it through a backup file data.
NebulaGraph version BR version master v3.6.0"},{"location":"20.appendix/6.eco-tool-version/#nebulagraph_bench","title":"NebulaGraph Bench","text":"NebulaGraph Bench is used to test the baseline performance data of NebulaGraph. It uses the standard data set of LDBC.
NebulaGraph version Bench version master v1.2.0"},{"location":"20.appendix/6.eco-tool-version/#api_and_sdk","title":"API and SDK","text":"Compatibility
Select the latest version of X.Y.*
which is the same as the core version.
The following are useful utilities and tools contributed and maintained by community users.
NebulaGraph returns an error code when an error occurs. This topic describes the details of the error code returned.
Note
0
, it means that the operation is successful.E_DISCONNECTED
-1
Lost connection E_FAIL_TO_CONNECT
-2
Unable to establish connection E_RPC_FAILURE
-3
RPC failure E_LEADER_CHANGED
-4
Raft leader has been changed E_SPACE_NOT_FOUND
-5
Graph space does not exist E_TAG_NOT_FOUND
-6
Tag does not exist E_EDGE_NOT_FOUND
-7
Edge type does not exist E_INDEX_NOT_FOUND
-8
Index does not exist E_EDGE_PROP_NOT_FOUND
-9
Edge type property does not exist E_TAG_PROP_NOT_FOUND
-10
Tag property does not exist E_ROLE_NOT_FOUND
-11
The current role does not exist E_CONFIG_NOT_FOUND
-12
The current configuration does not exist E_MACHINE_NOT_FOUND
-13
The current host does not exist E_LISTENER_NOT_FOUND
-15
Listener does not exist E_PART_NOT_FOUND
-16
The current partition does not exist E_KEY_NOT_FOUND
-17
Key does not exist E_USER_NOT_FOUND
-18
User does not exist E_STATS_NOT_FOUND
-19
Statistics do not exist E_SERVICE_NOT_FOUND
-20
No current service found E_DRAINER_NOT_FOUND
-21
Drainer does not exist E_DRAINER_CLIENT_NOT_FOUND
-22
Drainer client does not exist E_PART_STOPPED
-23
The current partition has already been stopped E_BACKUP_FAILED
-24
Backup failed E_BACKUP_EMPTY_TABLE
-25
The backed-up table is empty E_BACKUP_TABLE_FAILED
-26
Table backup failure E_PARTIAL_RESULT
-27
MultiGet could not get all data E_REBUILD_INDEX_FAILED
-28
Index rebuild failed E_INVALID_PASSWORD
-29
Password is invalid E_FAILED_GET_ABS_PATH
-30
Unable to get absolute path E_BAD_USERNAME_PASSWORD
-1001
Authentication failed E_SESSION_INVALID
-1002
Invalid session E_SESSION_TIMEOUT
-1003
Session timeout E_SYNTAX_ERROR
-1004
Syntax error E_EXECUTION_ERROR
-1005
Execution error E_STATEMENT_EMPTY
-1006
Statement is empty E_BAD_PERMISSION
-1008
Permission denied E_SEMANTIC_ERROR
-1009
Semantic error E_TOO_MANY_CONNECTIONS
-1010
Maximum number of connections exceeded E_PARTIAL_SUCCEEDED
-1011
Access to storage failed (only some requests succeeded) E_NO_HOSTS
-2001
Host does not exist E_EXISTED
-2002
Host already exists E_INVALID_HOST
-2003
Invalid host E_UNSUPPORTED
-2004
The current command, statement, or function is not supported E_NOT_DROP
-2005
Not allowed to drop E_CONFIG_IMMUTABLE
-2007
Configuration items cannot be changed E_CONFLICT
-2008
Parameters conflict with meta data E_INVALID_PARM
-2009
Invalid parameter E_WRONGCLUSTER
-2010
Wrong cluster E_ZONE_NOT_ENOUGH
-2011
Listener conflicts E_ZONE_IS_EMPTY
-2012
The host does not exist E_SCHEMA_NAME_EXISTS
-2013
Schema name already exists E_RELATED_INDEX_EXISTS
-2014
There are still indexes related to the tag or edge type, so it cannot be dropped E_RELATED_SPACE_EXISTS
-2015
There are still graph spaces on the host, so it cannot be dropped E_STORE_FAILURE
-2021
Failed to store data E_STORE_SEGMENT_ILLEGAL
-2022
Illegal storage segment E_BAD_BALANCE_PLAN
-2023
Invalid data balancing plan E_BALANCED
-2024
The cluster is already in the data balancing status E_NO_RUNNING_BALANCE_PLAN
-2025
There is no running data balancing plan E_NO_VALID_HOST
-2026
Lack of valid hosts E_CORRUPTED_BALANCE_PLAN
-2027
A data balancing plan that has been corrupted E_IMPROPER_ROLE
-2030
Failed to recover user role E_INVALID_PARTITION_NUM
-2031
Invalid number of partitions E_INVALID_REPLICA_FACTOR
-2032
Invalid replica factor E_INVALID_CHARSET
-2033
Invalid character set E_INVALID_COLLATE
-2034
Invalid character sorting rules E_CHARSET_COLLATE_NOT_MATCH
-2035
Character set and character sorting rule mismatch E_SNAPSHOT_FAILURE
-2040
Failed to generate a snapshot E_BLOCK_WRITE_FAILURE
-2041
Failed to write block data E_ADD_JOB_FAILURE
-2044
Failed to add new task E_STOP_JOB_FAILURE
-2045
Failed to stop task E_SAVE_JOB_FAILURE
-2046
Failed to save task information E_BALANCER_FAILURE
-2047
Data balancing failed E_JOB_NOT_FINISHED
-2048
The current task has not been completed E_TASK_REPORT_OUT_DATE
-2049
Task report failed E_JOB_NOT_IN_SPACE
-2050
The current task is not in the graph space E_JOB_NEED_RECOVER
-2051
The current task needs to be resumed E_JOB_ALREADY_FINISH
-2052
The job status has already been failed or finished E_JOB_SUBMITTED
-2053
Job default status E_JOB_NOT_STOPPABLE
-2054
The given job does not support being stopped E_JOB_HAS_NO_TARGET_STORAGE
-2055
The leader distribution has not been reported, so tasks cannot be sent to storage E_INVALID_JOB
-2065
Invalid task E_BACKUP_BUILDING_INDEX
-2066
Backup terminated (index being created) E_BACKUP_SPACE_NOT_FOUND
-2067
Graph space does not exist at the time of backup E_RESTORE_FAILURE
-2068
Backup recovery failed E_SESSION_NOT_FOUND
-2069
Session does not exist E_LIST_CLUSTER_FAILURE
-2070
Failed to get cluster information E_LIST_CLUSTER_GET_ABS_PATH_FAILURE
-2071
Failed to get absolute path when getting cluster information E_LIST_CLUSTER_NO_AGENT_FAILURE
-2072
Unable to get an agent when getting cluster information E_QUERY_NOT_FOUND
-2073
Query not found E_AGENT_HB_FAILUE
-2074
Failed to receive heartbeat from agent E_HOST_CAN_NOT_BE_ADDED
-2082
The host cannot be added because it is not a storage host E_ACCESS_ES_FAILURE
-2090
Failed to access Elasticsearch E_GRAPH_MEMORY_EXCEEDED
-2600
Graph memory exceeded E_CONSENSUS_ERROR
-3001
Consensus cannot be reached during an election E_KEY_HAS_EXISTS
-3002
Key already exists E_DATA_TYPE_MISMATCH
-3003
Data type mismatch E_INVALID_FIELD_VALUE
-3004
Invalid field value E_INVALID_OPERATION
-3005
Invalid operation E_NOT_NULLABLE
-3006
Current value is not allowed to be empty E_FIELD_UNSET
-3007
The field value must be set if the field is NOT NULL or has no default value E_OUT_OF_RANGE
-3008
The value is out of the range of the current type E_DATA_CONFLICT_ERROR
-3010
Data conflict E_WRITE_STALLED
-3011
Writes are delayed E_IMPROPER_DATA_TYPE
-3021
Incorrect data type E_INVALID_SPACEVIDLEN
-3022
Invalid VID length E_INVALID_FILTER
-3031
Invalid filter E_INVALID_UPDATER
-3032
Invalid field update E_INVALID_STORE
-3033
Invalid KV storage E_INVALID_PEER
-3034
Invalid peer E_RETRY_EXHAUSTED
-3035
Out of retries E_TRANSFER_LEADER_FAILED
-3036
Leader change failed E_INVALID_STAT_TYPE
-3037
Invalid stat type E_INVALID_VID
-3038
VID is invalid E_LOAD_META_FAILED
-3040
Failed to load meta information E_FAILED_TO_CHECKPOINT
-3041
Failed to generate checkpoint E_CHECKPOINT_BLOCKED
-3042
Generating checkpoint is blocked E_FILTER_OUT
-3043
Data is filtered E_INVALID_DATA
-3044
Invalid data E_MUTATE_EDGE_CONFLICT
-3045
Concurrent write conflicts on the same edge E_MUTATE_TAG_CONFLICT
-3046
Concurrent write conflict on the same vertex E_OUTDATED_LOCK
-3047
Lock is invalid E_INVALID_TASK_PARA
-3051
Invalid task parameter E_USER_CANCEL
-3052
The user canceled the task E_TASK_EXECUTION_FAILED
-3053
Task execution failed E_PLAN_IS_KILLED
-3060
Execution plan was cleared E_NO_TERM
-3070
The heartbeat process was not completed when the request was received E_OUTDATED_TERM
-3071
Out-of-date heartbeat received from the old leader (the new leader has been elected) E_WRITE_WRITE_CONFLICT
-3073
Concurrent write conflicts with later requests E_RAFT_UNKNOWN_PART
-3500
Unknown partition E_RAFT_LOG_GAP
-3501
Raft logs lag behind E_RAFT_LOG_STALE
-3502
Raft logs are out of date E_RAFT_TERM_OUT_OF_DATE
-3503
Heartbeat messages are out of date E_RAFT_UNKNOWN_APPEND_LOG
-3504
Unknown additional logs E_RAFT_WAITING_SNAPSHOT
-3511
Waiting for the snapshot to complete E_RAFT_SENDING_SNAPSHOT
-3512
There was an error sending the snapshot E_RAFT_INVALID_PEER
-3513
Invalid receiver E_RAFT_NOT_READY
-3514
Raft did not start E_RAFT_STOPPED
-3515
Raft has stopped E_RAFT_BAD_ROLE
-3516
Wrong role E_RAFT_WAL_FAIL
-3521
Write to a WAL failed E_RAFT_HOST_STOPPED
-3522
The host has stopped E_RAFT_TOO_MANY_REQUESTS
-3523
Too many requests E_RAFT_PERSIST_SNAPSHOT_FAILED
-3524
Persistent snapshot failed E_RAFT_RPC_EXCEPTION
-3525
RPC exception E_RAFT_NO_WAL_FOUND
-3526
No WAL logs found E_RAFT_HOST_PAUSED
-3527
Host suspended E_RAFT_WRITE_BLOCKED
-3528
Writes are blocked E_RAFT_BUFFER_OVERFLOW
-3529
Cache overflow E_RAFT_ATOMIC_OP_FAILED
-3530
Atomic operation failed E_LEADER_LEASE_FAILED
-3531
Leader lease expired E_RAFT_CAUGHT_UP
-3532
Data has been synchronized on Raft E_STORAGE_MEMORY_EXCEEDED
-3600
Storage memory exceeded E_LOG_GAP
-4001
Drainer logs lag behind E_LOG_STALE
-4002
Drainer logs are out of date E_INVALID_DRAINER_STORE
-4003
The drainer data storage is invalid E_SPACE_MISMATCH
-4004
Graph space mismatch E_PART_MISMATCH
-4005
Partition mismatch E_DATA_CONFLICT
-4006
Data conflict E_REQ_CONFLICT
-4007
Request conflict E_DATA_ILLEGAL
-4008
Illegal data E_CACHE_CONFIG_ERROR
-5001
Cache configuration error E_NOT_ENOUGH_SPACE
-5002
Insufficient space E_CACHE_MISS
-5003
No cache hit E_CACHE_WRITE_FAILURE
-5005
Write cache failed E_NODE_NUMBER_EXCEED_LIMIT
-7001
Number of machines exceeded the limit E_PARSING_LICENSE_FAILURE
-7002
Failed to resolve certificate E_UNKNOWN
-8000
Unknown error"},{"location":"20.appendix/history/","title":"History timeline for NebulaGraph","text":"2018.9: dutor wrote and submitted the first line of NebulaGraph database code.
2019.5: NebulaGraph v0.1.0-alpha was released as open-source.
NebulaGraph v1.0.0-beta, v1.0.0-rc1, v1.0.0-rc2, v1.0.0-rc3, and v1.0.0-rc4 were released one after another within a year thereafter.
2019.7: NebulaGraph's debut at HBaseCon [1]. @dangleptr
2020.3: Development of NebulaGraph v2.0 started in the final stage of v1.0 development.
2020.6: The first major version of NebulaGraph v1.0.0 GA was released.
2021.3: The second major version of NebulaGraph v2.0 GA was released.
2021.8: NebulaGraph v2.5.0 was released.
2021.10: NebulaGraph v2.6.0 was released.
2022.2: NebulaGraph v3.0.0 was released.
2022.4: NebulaGraph v3.1.0 was released.
2022.7: NebulaGraph v3.2.0 was released.
2022.10: NebulaGraph v3.3.0 was released.
2023.2: NebulaGraph v3.4.0 was released.
2023.5: NebulaGraph v3.5.0 was released.
2023.8: NebulaGraph v3.6.0 was released.
[1] NebulaGraph v1.x supports both RocksDB and HBase as its storage engines. NebulaGraph v2.x removed HBase support.
The following are the default ports used by NebulaGraph core and peripheral tools.
No. Product / Service Type Default Description 1 NebulaGraph TCP 9669 Graph service RPC daemon listening port. Commonly used for client connections to the Graph service. 2 NebulaGraph TCP 19669 Graph service HTTP port. 3 NebulaGraph TCP 19670 Graph service HTTP/2 port. (Deprecated after version 3.x) 4 NebulaGraph TCP 9559, 95609559
is the RPC daemon listening port for Meta service. Commonly used by Graph and Storage services for querying and updating metadata in the graph database. The neighboring +1
(9560
) port is used for Raft communication between Meta services. 5 NebulaGraph TCP 19559 Meta service HTTP port. 6 NebulaGraph TCP 19560 Meta service HTTP/2 port. (Deprecated after version 3.x) 7 NebulaGraph TCP 9779, 9778, 9780 9779
is the RPC daemon listening port for Storage service. Commonly used by Graph services for data storage-related operations, such as reading, writing, or deleting data. The neighboring ports -1
(9778
) and +1
(9780
) are also used. 9778
: The port used by the Admin service, which receives Meta commands for Storage. 9780
: The port used for Raft communication between Storage services. 8 NebulaGraph TCP 19779 Storage service HTTP port. 9 NebulaGraph TCP 19780 Storage service HTTP/2 port. (Deprecated after version 3.x) 10 NebulaGraph TCP 8888 Backup and restore Agent service port. The Agent is a daemon running on each machine in the cluster, responsible for starting and stopping NebulaGraph services and uploading and downloading backup files. 11 NebulaGraph TCP 9789, 9788, 9790 9789
is the Raft Listener port for Full-text index, which reads data from Storage services and writes it to the Elasticsearch cluster.Also the port for Storage Listener in inter-cluster data synchronization, used for synchronizing Storage data from the primary cluster. The neighboring ports -1
(9788
) and +1
(9790
) are also used.9788
: An internal port.9790
: The port used for Raft communication. 12 NebulaGraph TCP 9200 NebulaGraph uses this port for HTTP communication with Elasticsearch to perform full-text search queries and manage full-text indexes. 13 NebulaGraph TCP 9569, 9568, 9570 9569
is the Meta Listener port in inter-cluster data synchronization, used for synchronizing Meta data from the primary cluster. The neighboring ports -1
(9568
) and +1
(9570
) are also used.9568
: An internal port.9570
: The port used for Raft communication. 14 NebulaGraph TCP 9889, 9888, 9890 Drainer service port in inter-cluster data synchronization, used for synchronizing Storage and Meta data to the primary cluster. The neighboring ports -1
(9888
) and +1
(9890
) are also used.9888
: An internal port.9890
: The port used for Raft communication. 15 NebulaGraph Studio TCP 7001 Studio web service port. 16 NebulaGraph Dashboard TCP 8090 Nebula HTTP Gateway dependency service port. Provides an HTTP interface for cluster services to interact with the NebulaGraph database using nGQL statements.0 17 NebulaGraph Dashboard TCP 9200 Nebula Stats Exporter dependency service port. Collects cluster performance metrics, including service IP addresses, versions, and monitoring metrics (such as query count, query latency, heartbeat latency, etc.). 18 NebulaGraph Dashboard TCP 9100 Node Exporter dependency service port. Collects resource information for machines in the cluster, including CPU, memory, load, disk, and traffic. 19 NebulaGraph Dashboard TCP 9090 Prometheus service port. Time-series database for storing monitoring data. 20 NebulaGraph Dashboard TCP 7003 Dashboard Community Edition web service port."},{"location":"20.appendix/release-notes/dashboard-comm-release-note/","title":"NebulaGraph Dashboard Community Edition release notes","text":""},{"location":"20.appendix/release-notes/dashboard-comm-release-note/#community_edition_340","title":"Community Edition 3.4.0","text":"machine
.num_queries
, and adjust the display to time series aggregation. Enhance the full-text index. #5567 #5575 #5577 #5580 #5584 #5587
The changes involved are listed below:
DeleteRange
operation. #5525MATCH
statement when querying for non-existent properties. #5634Find All Path
statement. #5621 #5640MATCH
statement causes the all()
function push-down optimization to fail. #5631MATCH
statement that returns incorrect results when querying the self-loop by the shortest path. #5636MATCH
statement that returns missing properties of edges when matching multiple hops. #5646 The long-term tasks run by the Storage Service are called jobs, such as COMPACT
, FLUSH
, and STATS
. These jobs can be time-consuming if the data amount in the graph space is large. The job manager helps you run, show, stop, and recover jobs.
Note
All job management commands can be executed only after selecting a graph space.
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_balance_leader","title":"SUBMIT JOB BALANCE LEADER","text":"Starts a job to balance the distribution of all the storage leaders in all graph spaces. It returns the job ID.
For example:
nebula> SUBMIT JOB BALANCE LEADER;\n+------------+\n| New Job Id |\n+------------+\n| 33 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_compact","title":"SUBMIT JOB COMPACT","text":"The SUBMIT JOB COMPACT
statement triggers the long-term RocksDB compact
operation in the current graph space.
For more information about compact
configuration, see Storage Service configuration.
For example:
nebula> SUBMIT JOB COMPACT;\n+------------+\n| New Job Id |\n+------------+\n| 40 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_flush","title":"SUBMIT JOB FLUSH","text":"The SUBMIT JOB FLUSH
statement writes the RocksDB memfile in the memory to the hard disk in the current graph space.
For example:
nebula> SUBMIT JOB FLUSH;\n+------------+\n| New Job Id |\n+------------+\n| 96 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_stats","title":"SUBMIT JOB STATS","text":"The SUBMIT JOB STATS
statement starts a job that makes the statistics of the current graph space. Once this job succeeds, you can use the SHOW STATS
statement to list the statistics. For more information, see SHOW STATS.
Note
If the data stored in the graph space changes, in order to get the latest statistics, you have to run SUBMIT JOB STATS
again.
For example:
nebula> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 9 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#submit_job_downloadingest","title":"SUBMIT JOB DOWNLOAD/INGEST","text":"The SUBMIT JOB DOWNLOAD HDFS
and SUBMIT JOB INGEST
commands are used to import the SST file into NebulaGraph. For detail, see Import data from SST files.
The SUBMIT JOB DOWNLOAD HDFS
command will download the SST file on the specified HDFS.
The SUBMIT JOB INGEST
command will import the downloaded SST file into NebulaGraph.
For example:
nebula> SUBMIT JOB DOWNLOAD HDFS \"hdfs://192.168.10.100:9000/sst\";\n+------------+\n| New Job Id |\n+------------+\n| 10 |\n+------------+\nnebula> SUBMIT JOB INGEST;\n+------------+\n| New Job Id |\n+------------+\n| 11 |\n+------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#show_job","title":"SHOW JOB","text":"The Meta Service parses a SUBMIT JOB
request into multiple tasks and assigns them to the nebula-storaged processes. The SHOW JOB <job_id>
statement shows the information about a specific job and all its tasks in the current graph space.
job_id
is returned when you run the SUBMIT JOB
statement.
For example:
nebula> SHOW JOB 8;\n+----------------+-----------------+------------+----------------------------+----------------------------+-------------+\n| Job Id(TaskId) | Command(Dest) | Status | Start Time | Stop Time | Error Code |\n+----------------+-----------------+------------+----------------------------+----------------------------+-------------+\n| 8 | \"STATS\" | \"FINISHED\" | 2022-10-18T08:14:45.000000 | 2022-10-18T08:14:45.000000 | \"SUCCEEDED\" |\n| 0 | \"192.168.8.129\" | \"FINISHED\" | 2022-10-18T08:14:45.000000 | 2022-10-18T08:15:13.000000 | \"SUCCEEDED\" |\n| \"Total:1\" | \"Succeeded:1\" | \"Failed:0\" | \"In Progress:0\" | \"\" | \"\" |\n+----------------+-----------------+------------+----------------------------+----------------------------+-------------+\n
The descriptions are as follows.
Parameter DescriptionJob Id(TaskId)
The first row shows the job ID and the other rows show the task IDs and the last row shows the total number of job-related tasks. Command(Dest)
The first row shows the command executed and the other rows show on which storaged processes the task is running. The last row shows the number of successful tasks related to the job. Status
Shows the status of the job or task. The last row shows the number of failed tasks related to the job. For more information, see Job status. Start Time
Shows a timestamp indicating the time when the job or task enters the RUNNING
phase. The last row shows the number of ongoing tasks related to the job. Stop Time
Shows a timestamp indicating the time when the job or task gets FINISHED
, FAILED
, or STOPPED
. Error Code
The error code of job."},{"location":"3.ngql-guide/4.job-statements/#job_status","title":"Job status","text":"The descriptions are as follows.
Status Description QUEUE The job or task is waiting in a queue. TheStart Time
is empty in this phase. RUNNING The job or task is running. The Start Time
shows the beginning time of this phase. FINISHED The job or task is successfully finished. The Stop Time
shows the time when the job or task enters this phase. FAILED The job or task has failed. The Stop Time
shows the time when the job or task enters this phase. STOPPED The job or task is stopped without running. The Stop Time
shows the time when the job or task enters this phase. REMOVED The job or task is removed. The description of switching the status is described as follows.
Queue -- running -- finished -- removed\n \\ \\ /\n \\ \\ -- failed -- /\n \\ \\ /\n \\ ---------- stopped -/\n
"},{"location":"3.ngql-guide/4.job-statements/#show_jobs","title":"SHOW JOBS","text":"The SHOW JOBS
statement lists all the unexpired jobs in the current graph space.
The default job expiration interval is one week. You can change it by modifying the job_expired_secs
parameter of the Meta Service. For how to modify job_expired_secs
, see Meta Service configuration.
For example:
nebula> SHOW JOBS;\n+--------+---------------------+------------+----------------------------+----------------------------+\n| Job Id | Command | Status | Start Time | Stop Time |\n+--------+---------------------+------------+----------------------------+----------------------------+\n| 34 | \"STATS\" | \"FINISHED\" | 2021-11-01T03:32:27.000000 | 2021-11-01T03:32:27.000000 |\n| 33 | \"FLUSH\" | \"FINISHED\" | 2021-11-01T03:32:15.000000 | 2021-11-01T03:32:15.000000 |\n| 32 | \"COMPACT\" | \"FINISHED\" | 2021-11-01T03:32:06.000000 | 2021-11-01T03:32:06.000000 |\n| 31 | \"REBUILD_TAG_INDEX\" | \"FINISHED\" | 2021-10-29T05:39:16.000000 | 2021-10-29T05:39:17.000000 |\n| 10 | \"COMPACT\" | \"FINISHED\" | 2021-10-26T02:27:05.000000 | 2021-10-26T02:27:05.000000 |\n+--------+---------------------+------------+----------------------------+----------------------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#stop_job","title":"STOP JOB","text":"The STOP JOB <job_id>
statement stops jobs that are not finished in the current graph space.
For example:
nebula> STOP JOB 22;\n+---------------+\n| Result |\n+---------------+\n| \"Job stopped\" |\n+---------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#recover_job","title":"RECOVER JOB","text":"The RECOVER JOB [<job_id>]
statement re-executes the jobs that status is FAILED
or STOPPED
in the current graph space and returns the number of recovered jobs. If <job_id>
is not specified, re-execution is performed from the earliest job and the number of jobs that have been recovered is returned.
For example:
nebula> RECOVER JOB;\n+-------------------+\n| Recovered job num |\n+-------------------+\n| 5 job recovered |\n+-------------------+\n
"},{"location":"3.ngql-guide/4.job-statements/#faq","title":"FAQ","text":""},{"location":"3.ngql-guide/4.job-statements/#how_to_troubleshoot_job_problems","title":"How to troubleshoot job problems?","text":"The SUBMIT JOB
operations use the HTTP port. Please check if the HTTP ports on the machines where the Storage Service is running are working well. You can use the following command to debug.
curl \"http://{storaged-ip}:19779/admin?space={space_name}&op=compact\"\n
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/","title":"NebulaGraph Query Language (nGQL)","text":"This topic gives an introduction to the query language of NebulaGraph, nGQL.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#what_is_ngql","title":"What is nGQL","text":"nGQL is a declarative graph query language for NebulaGraph. It allows expressive and efficient graph patterns. nGQL is designed for both developers and operations professionals. nGQL is an SQL-like query language, so it's easy to learn.
nGQL is a project in progress. New features and optimizations are done steadily. There can be differences between syntax and implementation. Submit an issue to inform the NebulaGraph team if you find a new issue of this type. NebulaGraph 3.0 or later releases will support openCypher 9.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#what_can_ngql_do","title":"What can nGQL do","text":"Users can download the example data Basketballplayer in NebulaGraph. After downloading the example data, you can import it to NebulaGraph by using the -f
option in NebulaGraph Console.
Note
Ensure that you have executed the ADD HOSTS
command to add the Storage service to your NebulaGraph cluster before importing the example data. For more information, see Manage Storage hosts.
Refer to the following standards in nGQL:
In template code, any token that is not a keyword, a literal value, or punctuation is a placeholder identifier or a placeholder value.
For details of the symbols in nGQL syntax, see the following table:
Token Meaning < > name of a syntactic element : formula that defines an element [ ] optional elements { } explicitly specified elements | complete alternative elements ... may be repeated any number of timesFor example, create vertices in nGQL syntax:
INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...]\nVALUES <vid>: ([prop_value_list])\ntag_props:\n tag_name ([prop_name_list])\nprop_name_list:\n [prop_name [, prop_name] ...]\nprop_value_list:\n [prop_value [, prop_value] ...] \n
Example statement:
nebula> CREATE TAG IF NOT EXISTS player(name string, age int);\n
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#about_opencypher_compatibility","title":"About openCypher compatibility","text":""},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#native_ngql_and_opencypher","title":"Native nGQL and openCypher","text":"Native nGQL is the part of a graph query language designed and implemented by NebulaGraph. OpenCypher is a graph query language maintained by openCypher Implementers Group.
The latest release is openCypher 9. The compatible parts of openCypher in nGQL are called openCypher compatible sentences (short as openCypher).
Note
nGQL
= native nGQL
+ openCypher compatible sentences
NO.
Compatibility with openCypher
nGQL is designed to be compatible with part of DQL (match, optional match, with, etc.).
Users can search in this manual with the keyword compatibility
to find major compatibility issues.
Multiple known incompatible items are listed in NebulaGraph Issues. Submit an issue with the incompatible
tag if you find a new issue of this type.
The following are some major differences (by design incompatible) between nGQL and openCypher.
Category openCypher 9 nGQL Schema Optional Schema Strong Schema Equality operator=
==
Math exponentiation ^
^
is not supported. Use pow(x, y) instead. Edge rank No such concept. edge rank (reference by @) Statement - All DMLs (CREATE
, MERGE
, etc) of openCypher 9. Label and tag A label is used for searching a vertex, namely an index of vertex. A tag defines the type of a vertex and its corresponding properties. It cannot be used as an index. Pre-compiling and parameterized queries Support Parameterized queries are supported, but precompiling is not. Compatibility
OpenCypher 9 and Cypher have some differences in grammar and licence. For example,
Cypher requires that All Cypher statements are explicitly run within a transaction. While openCypher has no such requirement. And nGQL does not support transactions.
Cypher has a variety of constraints, including Unique node property constraints, Node property existence constraints, Relationship property existence constraints, and Node key constraints. While OpenCypher has no such constraints. As a strong schema system, most of the constraints mentioned above can be solved through schema definitions (including NOT NULL) in nGQL. The only function that cannot be supported is the UNIQUE constraint.
Cypher has APoC, while openCypher 9 does not have APoC. Cypher has Blot protocol support requirements, while openCypher 9 does not.
Users can find more than 2500 nGQL examples in the features directory on the NebulaGraph GitHub page.
The features
directory consists of .feature
files. Each file records scenarios that you can use as nGQL examples. Here is an example:
Feature: Basic match\n\n Background:\n Given a graph with space named \"basketballplayer\"\n\n Scenario: Single node\n When executing query:\n \"\"\"\n MATCH (v:player {name: \"Yao Ming\"}) RETURN v;\n \"\"\"\n Then the result should be, in any order, with relax comparison:\n | v |\n | (\"player133\" :player{age: 38, name: \"Yao Ming\"}) |\n\n Scenario: One step\n When executing query:\n \"\"\"\n MATCH (v1:player{name: \"LeBron James\"}) -[r]-> (v2)\n RETURN type(r) AS Type, v2.player.name AS Name\n \"\"\"\n Then the result should be, in any order:\n\n | Type | Name |\n | \"follow\" | \"Ray Allen\" |\n | \"serve\" | \"Lakers\" |\n | \"serve\" | \"Heat\" |\n | \"serve\" | \"Cavaliers\" |\n\nFeature: Comparison of where clause\n\n Background:\n Given a graph with space named \"basketballplayer\"\n\n Scenario: push edge props filter down\n When profiling query:\n \"\"\"\n GO FROM \"player100\" OVER follow \n WHERE properties(edge).degree IN [v IN [95,99] WHERE v > 0] \n YIELD dst(edge), properties(edge).degree\n \"\"\"\n Then the result should be, in any order:\n | follow._dst | follow.degree |\n | \"player101\" | 95 |\n | \"player125\" | 95 |\n And the execution plan should be:\n | id | name | dependencies | operator info |\n | 0 | Project | 1 | |\n | 1 | GetNeighbors | 2 | {\"filter\": \"(properties(edge).degree IN [v IN [95,99] WHERE (v>0)])\"} |\n | 2 | Start | | |\n
The keywords in the preceding example are described as follows.
Keyword DescriptionFeature
Describes the topic of the current .feature
file. Background
Describes the background information of the current .feature
file. Given
Describes the prerequisites of running the test statements in the current .feature
file. Scenario
Describes the scenarios. If there is the @skip
before one Scenario
, this scenario may not work; do not use it as a working example in a production environment. When
Describes the nGQL statement to be executed. It can be an executing query
or profiling query
. Then
Describes the expected return results of running the statement in the When
clause. If the return results in your environment do not match the results described in the .feature
file, submit an issue to inform the NebulaGraph team. And
Describes the side effects of running the statement in the When
clause. @skip
This test case will be skipped. Commonly, the to-be-tested code is not ready. Welcome to add more tck case and return automatically to the using statements in CI/CD.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#does_it_support_tinkerpop_gremlin","title":"Does it support TinkerPop Gremlin?","text":"No. And no plan to support that.
"},{"location":"3.ngql-guide/1.nGQL-overview/1.overview/#does_nebulagraph_support_w3c_rdf_sparql_or_graphql","title":"Does NebulaGraph support W3C RDF (SPARQL) or GraphQL?","text":"No. And no plan to support that.
The data model of NebulaGraph is the property graph. And as a strong schema system, NebulaGraph does not support RDF.
NebulaGraph Query Language does not support SPARQL
nor GraphQL
.
Patterns and graph pattern matching are the very heart of a graph query language. This topic will describe the patterns in NebulaGraph, some of which have not yet been implemented.
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_vertices","title":"Patterns for vertices","text":"A vertex is described using a pair of parentheses and is typically given a name. For example:
(a)\n
This simple pattern describes a single vertex and names that vertex using the variable a
.
A more powerful construct is a pattern that describes multiple vertices and edges between them. Patterns describe an edge by employing an arrow between two vertices. For example:
(a)-[]->(b)\n
This pattern describes a very simple data structure: two vertices and a single edge from one to the other. In this example, the two vertices are named as a
and b
respectively and the edge is directed
: it goes from a
to b
.
This manner of describing vertices and edges can be extended to cover an arbitrary number of vertices and the edges between them, for example:
(a)-[]->(b)<-[]-(c)\n
Such a series of connected vertices and edges is called a path
.
Note that the naming of the vertices in these patterns is only necessary when one needs to refer to the same vertex again, either later in the pattern or elsewhere in the query. If not, the name may be omitted as follows:
(a)-[]->()<-[]-(c)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_tags","title":"Patterns for tags","text":"Note
The concept of tag
in nGQL has a few differences from that of label
in openCypher. For example, users must create a tag
before using it. And a tag
also defines the type of properties.
In addition to simply describing the vertices in the graphs, patterns can also describe the tags of the vertices. For example:
(a:User)-[]->(b)\n
Patterns can also describe a vertex that has multiple tags. For example:
(a:User:Admin)-[]->(b)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_properties","title":"Patterns for properties","text":"Vertices and edges are the fundamental elements in a graph. In nGQL, properties are added to them for richer models.
In the patterns, the properties can be expressed as follows: some key-value pairs are enclosed in curly brackets and separated by commas, and the tag or edge type to which a property belongs must be specified.
For example, a vertex with two properties will be like:
(a:player{name: \"Tim Duncan\", age: 42})\n
One of the edges that connect to this vertex can be like:
(a)-[e:follow{degree: 95}]->(b)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#patterns_for_edges","title":"Patterns for edges","text":"The simplest way to describe an edge is by using the arrow between two vertices, as in the previous examples.
Users can describe an edge and its direction using the following statement. If users do not care about its direction, the arrowhead can be omitted. For example:
(a)-[]-(b)\n
Like vertices, edges can also be named. A pair of square brackets will be used to separate the arrow and the variable will be placed between them. For example:
(a)-[r]->(b)\n
Like the tags on vertices, edges can also have types. To describe an edge with a specific type, use the pattern as follows:
(a)-[r:REL_TYPE]->(b)\n
An edge can only have one edge type. But if we'd like to describe some data such that the edge could have a set of types, then they can all be listed in the pattern, separating them with the pipe symbol |
like this:
(a)-[r:TYPE1|TYPE2]->(b)\n
Like vertices, the name of an edge can be omitted. For example:
(a)-[:REL_TYPE]->(b)\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#variable-length_pattern","title":"Variable-length pattern","text":"Rather than describing a long path using a sequence of many vertex and edge descriptions in a pattern, many edges (and the intermediate vertices) can be described by specifying a length in the edge description of a pattern. For example:
(a)-[*2]->(b)\n
The following pattern describes a graph of three vertices and two edges, all in one path (a path of length 2). It is equivalent to:
(a)-[]->()-[]->(b)\n
The range of lengths can also be specified. Such edge patterns are called variable-length edges
. For example:
(a)-[*3..5]->(b)\n
The preceding example defines a path with a minimum length of 3 and a maximum length of 5.
It describes a graph of either 4 vertices and 3 edges, 5 vertices and 4 edges, or 6 vertices and 5 edges, all connected in a single path.
You may specify either the upper limit or lower limit of the length range, or neither of them, for example:
(a)-[*..5]->(b) // The minimum length is 1 and the maximum length is 5.\n(a)-[*3..]->(b) // The minimum length is 3 and the maximum length is infinity.\n(a)-[*]->(b) // The minimum length is 1 and the maximum length is infinity.\n
"},{"location":"3.ngql-guide/1.nGQL-overview/3.graph-patterns/#assigning_to_path_variables","title":"Assigning to path variables","text":"As described above, a series of connected vertices and edges is called a path
. nGQL allows paths to be named using variables. For example:
p = (a)-[*3..5]->(b)\n
Users can do this in the MATCH
statement.
This topic will describe the comments in nGQL.
Legacy version compatibility
#
, --
, //
, /* */
.--
cannot be used as comments.nebula> RETURN 1+1; # This comment continues to the end of this line.\nnebula> RETURN 1+1; // This comment continues to the end of this line.\nnebula> RETURN 1 /* This is an in-line comment. */ + 1 == 2;\nnebula> RETURN 11 + \\\n/* Multi-line comment. \\\nUse a backslash as a line break. \\\n*/ 12;\n
Note
\\
in a line indicates a line break.#
or //
, the statement is not executed and the error StatementEmpty
is returned.\\
at the end of every line, even in multi-line comments /* */
.\\
as a line break./* openCypher style:\nThe following comment\nspans more than\none line */\nMATCH (n:label)\nRETURN n;\n
/* nGQL style: \\\nThe following comment \\\nspans more than \\\none line */ \\\nMATCH (n:tag) \\\nRETURN n;\n
"},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/","title":"Identifier case sensitivity","text":""},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/#identifiers_are_case-sensitive","title":"Identifiers are Case-Sensitive","text":"The following statements will not work because they refer to two different spaces, i.e. my_space
and MY_SPACE
.
nebula> CREATE SPACE IF NOT EXISTS my_space (vid_type=FIXED_STRING(30));\nnebula> use MY_SPACE;\n[ERROR (-1005)]: SpaceNotFound:\n
"},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/#keywords_and_reserved_words_are_case-insensitive","title":"Keywords and Reserved Words are Case-Insensitive","text":"The following statements are equivalent since show
and spaces
are keywords.
nebula> show spaces; \nnebula> SHOW SPACES;\nnebula> SHOW spaces;\nnebula> show SPACES;\n
"},{"location":"3.ngql-guide/1.nGQL-overview/identifier-case-sensitivity/#functions_are_case-insensitive","title":"Functions are Case-Insensitive","text":"Functions are case-insensitive. For example, count()
, COUNT()
, and couNT()
are equivalent.
nebula> WITH [NULL, 1, 1, 2, 2] As a \\\n UNWIND a AS b \\\n RETURN count(b), COUNT(*), couNT(DISTINCT b);\n+----------+----------+-------------------+\n| count(b) | COUNT(*) | couNT(distinct b) |\n+----------+----------+-------------------+\n| 4 | 5 | 2 |\n+----------+----------+-------------------+\n
"},{"location":"3.ngql-guide/1.nGQL-overview/keywords-and-reserved-words/","title":"Keywords","text":"Keywords in nGQL are words with particular meanings, such as CREATE
and TAG
in the CREATE TAG
statement. Keywords that require special processing to be used as identifiers are referred to as reserved keywords
, while the part of keywords that can be used directly as identifiers are called non-reserved keywords
.
It is not recommended to use keywords to identify schemas. If you must use keywords as identifiers, pay attention to the following restrictions:
To use non-reserved keywords as identifiers:
Note
Keywords are case-insensitive.
nebula> CREATE TAG TAG(name string);\n[ERROR (-1004)]: SyntaxError: syntax error near `TAG'\n\nnebula> CREATE TAG `TAG` (name string);\nExecution succeeded\n\nnebula> CREATE TAG SPACE(name string);\nExecution succeeded\n\nnebula> CREATE TAG \u4e2d\u6587(\u7b80\u4f53 string);\nExecution succeeded\n\nnebula> CREATE TAG `\uffe5%special characters&*+-*/` (`q~\uff01\uff08\uff09= wer` string);\nExecution succeeded\n
"},{"location":"3.ngql-guide/1.nGQL-overview/keywords-and-reserved-words/#reserved_keywords","title":"Reserved keywords","text":"ACROSS\nADD\nALTER\nAND\nAS\nASC\nASCENDING\nBALANCE\nBOOL\nBY\nCASE\nCHANGE\nCOMPACT\nCREATE\nDATE\nDATETIME\nDELETE\nDESC\nDESCENDING\nDESCRIBE\nDISTINCT\nDOUBLE\nDOWNLOAD\nDROP\nDURATION\nEDGE\nEDGES\nEXISTS\nEXPLAIN\nFALSE\nFETCH\nFIND\nFIXED_STRING\nFLOAT\nFLUSH\nFROM\nGEOGRAPHY\nGET\nGO\nGRANT\nIF\nIGNORE_EXISTED_INDEX\nIN\nINDEX\nINDEXES\nINGEST\nINSERT\nINT\nINT16\nINT32\nINT64\nINT8\nINTERSECT\nIS\nJOIN\nLEFT\nLIST\nLOOKUP\nMAP\nMATCH\nMINUS\nNO\nNOT\nNULL\nOF\nON\nOR\nORDER\nOVER\nOVERWRITE\nPATH\nPROP\nREBUILD\nRECOVER\nREMOVE\nRESTART\nRETURN\nREVERSELY\nREVOKE\nSET\nSHOW\nSTEP\nSTEPS\nSTOP\nSTRING\nSUBMIT\nTAG\nTAGS\nTIME\nTIMESTAMP\nTO\nTRUE\nUNION\nUNWIND\nUPDATE\nUPSERT\nUPTO\nUSE\nVERTEX\nVERTICES\nWHEN\nWHERE\nWITH\nXOR\nYIELD\n
"},{"location":"3.ngql-guide/1.nGQL-overview/keywords-and-reserved-words/#non-reserved_keywords","title":"Non-reserved keywords","text":"ACCOUNT\nADMIN\nAGENT\nALL\nALLSHORTESTPATHS\nANALYZER\nANY\nATOMIC_EDGE\nAUTO\nBASIC\nBIDIRECT\nBOTH\nCHARSET\nCLEAR\nCLIENTS\nCOLLATE\nCOLLATION\nCOMMENT\nCONFIGS\nCONTAINS\nDATA\nDBA\nDEFAULT\nDIVIDE\nDRAINER\nDRAINERS\nELASTICSEARCH\nELSE\nEND\nENDS\nES_QUERY\nFORCE\nFORMAT\nFULLTEXT\nGOD\nGRANTS\nGRAPH\nGROUP\nGROUPS\nGUEST\nHDFS\nHOST\nHOSTS\nHTTP\nHTTPS\nINTO\nIP\nJOB\nJOBS\nKILL\nLEADER\nLIMIT\nLINESTRING\nLISTENER\nLOCAL\nMERGE\nMETA\nNEW\nNOLOOP\nNONE\nOFFSET\nOPTIONAL\nOUT\nPART\nPARTITION_NUM\nPARTS\nPASSWORD\nPLAN\nPOINT\nPOLYGON\nPROFILE\nQUERIES\nQUERY\nREAD\nREDUCE\nRENAME\nREPLICA_FACTOR\nRESET\nROLE\nROLES\nS2_MAX_CELLS\nS2_MAX_LEVEL\nSAMPLE\nSEARCH\nSERVICE\nSESSION\nSESSIONS\nSHORTEST\nSHORTESTPATH\nSIGN\nSINGLE\nSKIP\nSNAPSHOT\nSNAPSHOTS\nSPACE\nSPACES\nSTARTS\nSTATS\nSTATUS\nSTORAGE\nSUBGRAPH\nSYNC\nTEXT\nTEXT_SEARCH\nTHEN\nTOP\nTTL_COL\nTTL_DURATION\nUSER\nUSERS\nUUID\nVALUE\nVALUES\nVARIABLES\nVID_TYPE\nWHITELIST\nWRITE\nZONE\nZONES\n
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/","title":"nGQL style guide","text":"nGQL does not have strict formatting requirements, but creating nGQL statements according to an appropriate and uniform style can improve readability and avoid ambiguity. Using the same nGQL style in the same organization or project helps reduce maintenance costs and avoid problems caused by format confusion or misunderstanding. This topic will provide a style guide for writing nGQL statements.
Compatibility
The styles of nGQL and Cypher Style Guide are different.
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/#newline","title":"Newline","text":"Start a new line to write a clause.
Not recommended:
GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS id;\n
Recommended:
GO FROM \"player100\" \\\nOVER follow REVERSELY \\\nYIELD src(edge) AS id;\n
Start a new line to write different statements in a composite statement.
Not recommended:
GO FROM \"player100\" OVER follow REVERSELY YIELD src(edge) AS id | GO FROM $-.id \\\nOVER serve WHERE properties($^).age > 20 YIELD properties($^).name AS FriendOf, properties($$).name AS Team;\n
Recommended:
GO FROM \"player100\" \\\nOVER follow REVERSELY \\\nYIELD src(edge) AS id | \\\nGO FROM $-.id OVER serve \\\nWHERE properties($^).age > 20 \\\nYIELD properties($^).name AS FriendOf, properties($$).name AS Team;\n
If the clause exceeds 80 characters, start a new line at the appropriate place.
Not recommended:
MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\nWHERE (v2.player.name STARTS WITH \"Y\" AND v2.player.age > 35 AND v2.player.age < v.player.age) OR (v2.player.name STARTS WITH \"T\" AND v2.player.age < 45 AND v2.player.age > v.player.age) \\\nRETURN v2;\n
Recommended:
MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\nWHERE (v2.player.name STARTS WITH \"Y\" AND v2.player.age > 35 AND v2.player.age < v.player.age) \\\nOR (v2.player.name STARTS WITH \"T\" AND v2.player.age < 45 AND v2.player.age > v.player.age) \\\nRETURN v2;\n
Note
If needed, you can also start a new line for better understanding, even if the clause does not exceed 80 characters.
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/#identifier_naming","title":"Identifier naming","text":"In nGQL statements, characters other than keywords, punctuation marks, and blanks are all identifiers. Recommended methods to name the identifiers are as follows.
Use singular nouns to name tags, and use the base form of verbs or verb phrases to form Edge types.
Not recommended:
MATCH p=(v:players)-[e:are_following]-(v2) \\\nRETURN nodes(p);\n
Recommended:
MATCH p=(v:player)-[e:follow]-(v2) \\\nRETURN nodes(p);\n
Use the snake case to name identifiers, and connect words with underscores (_) with all the letters lowercase.
Not recommended:
MATCH (v:basketballTeam) \\\nRETURN v;\n
Recommended:
MATCH (v:basketball_team) \\\nRETURN v;\n
Use uppercase keywords and lowercase variables.
Not recommended:
match (V:player) return V limit 5;\n
Recommended:
MATCH (v:player) RETURN v LIMIT 5;\n
Start a new line on the right side of the arrow indicating an edge when writing patterns.
Not recommended:
MATCH (v:player{name: \"Tim Duncan\", age: 42}) \\\n-[e:follow]->()-[e:serve]->()<--(v2) \\\nRETURN v, e, v2;\n
Recommended:
MATCH (v:player{name: \"Tim Duncan\", age: 42})-[e:follow]-> \\\n()-[e:serve]->()<--(v2) \\\nRETURN v, e, v2;\n
Anonymize the vertices and edges that do not need to be queried.
Not recommended:
MATCH (v:player)-[e:follow]->(v2) \\\nRETURN v;\n
Recommended:
MATCH (v:player)-[:follow]->() \\\nRETURN v;\n
Place named vertices in front of anonymous vertices.
Not recommended:
MATCH ()-[:follow]->(v) \\\nRETURN v;\n
Recommended:
MATCH (v)<-[:follow]-() \\\nRETURN v;\n
The strings should be surrounded by double quotes.
Not recommended:
RETURN 'Hello Nebula!';\n
Recommended:
RETURN \"Hello Nebula!\\\"123\\\"\";\n
Note
When single or double quotes need to be nested in a string, use a backslash () to escape. For example:
RETURN \"\\\"NebulaGraph is amazing,\\\" the user says.\";\n
"},{"location":"3.ngql-guide/1.nGQL-overview/ngql-style-guide/#statement_termination","title":"Statement termination","text":"End the nGQL statements with an English semicolon (;).
Not recommended:
FETCH PROP ON player \"player100\" YIELD properties(vertex)\n
Recommended:
FETCH PROP ON player \"player100\" YIELD properties(vertex);\n
Use a pipe (|) to separate a composite statement, and end the statement with an English semicolon at the end of the last line. Using an English semicolon before a pipe will cause the statement to fail.
Not supported:
GO FROM \"player100\" \\\nOVER follow \\\nYIELD dst(edge) AS id; | \\\nGO FROM $-.id \\\nOVER serve \\\nYIELD properties($$).name AS Team, properties($^).name AS Player;\n
Supported:
GO FROM \"player100\" \\\nOVER follow \\\nYIELD dst(edge) AS id | \\\nGO FROM $-.id \\\nOVER serve \\\nYIELD properties($$).name AS Team, properties($^).name AS Player;\n
In a composite statement that contains user-defined variables, use an English semicolon to end the statements that define the variables. If you do not follow the rules to add a semicolon or use a pipe to end the composite statement, the execution will fail.
Not supported:
$var = GO FROM \"player100\" \\\nOVER follow \\\nYIELD follow._dst AS id \\\nGO FROM $var.id \\\nOVER serve \\\nYIELD $$.team.name AS Team, $^.player.name AS Player;\n
Not supported:
$var = GO FROM \"player100\" \\\nOVER follow \\\nYIELD follow._dst AS id | \\\nGO FROM $var.id \\\nOVER serve \\\nYIELD $$.team.name AS Team, $^.player.name AS Player;\n
Supported:
$var = GO FROM \"player100\" \\\nOVER follow \\\nYIELD follow._dst AS id; \\\nGO FROM $var.id \\\nOVER serve \\\nYIELD $$.team.name AS Team, $^.player.name AS Player;\n
CREATE TAG
creates a tag with the given name in a graph space.
Tags in nGQL are similar to labels in openCypher. But they are also quite different. For example, the ways to create them are different.
CREATE
statements.CREATE TAG
statements. Tags in nGQL are more like tables in MySQL.Running the CREATE TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
To create a tag in a specific graph space, you must specify the current working space with the USE
statement.
CREATE TAG [IF NOT EXISTS] <tag_name>\n (\n <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']\n [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] \n )\n [TTL_DURATION = <ttl_duration>]\n [TTL_COL = <prop_name>]\n [COMMENT = '<comment>'];\n
Parameter Description IF NOT EXISTS
Detects if the tag that you want to create exists. If it does not exist, a new one will be created. The tag existence detection here only compares the tag names (excluding properties). <tag_name>
1. Each tag name in the graph space must be unique. 2. Tag names cannot be modified after they are set. 3. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number. 4. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words. Note:1. If you name a tag in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in a tag name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <prop_name>
The name of the property. It must be unique for each tag. The rules for permitted property names are the same as those for tag names. <data_type>
Shows the data type of each property. For a full description of the property data types, see Data types and Boolean. NULL | NOT NULL
Specifies if the property supports NULL | NOT NULL
. The default value is NULL
. DEFAULT
Specifies a default value for a property. The default value can be a literal value or an expression supported by NebulaGraph. If no value is specified, the default value is used when inserting a new vertex. COMMENT
The remarks of a certain property or the tag itself. The maximum length is 256 bytes. By default, there will be no comments on a tag. TTL_DURATION
Specifies the life cycle for the property. The property that exceeds the specified TTL expires. The expiration threshold is the TTL_COL
value plus the TTL_DURATION
. The default value of TTL_DURATION
is 0
. It means the data never expires. TTL_COL
Specifies the property to set a timeout on. The data type of the property must be int
or timestamp
. A tag can only specify one field as TTL_COL
. For more information on TTL, see TTL options."},{"location":"3.ngql-guide/10.tag-statements/1.create-tag/#examples","title":"Examples","text":"nebula> CREATE TAG IF NOT EXISTS player(name string, age int);\n\n# The following example creates a tag with no properties.\nnebula> CREATE TAG IF NOT EXISTS no_property();\u00a0\n\n# The following example creates a tag with a default value.\nnebula> CREATE TAG IF NOT EXISTS player_with_default(name string, age int DEFAULT 20);\n\n# In the following example, the TTL of the create_time field is set to be 100 seconds.\nnebula> CREATE TAG IF NOT EXISTS woman(name string, age int, \\\n married bool, salary double, create_time timestamp) \\\n TTL_DURATION = 100, TTL_COL = \"create_time\";\n
"},{"location":"3.ngql-guide/10.tag-statements/1.create-tag/#implementation_of_the_operation","title":"Implementation of the operation","text":"Trying to use a newly created tag may fail because the creation of the tag is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
DROP TAG
drops a tag with the given name in the current working graph space.
A vertex can have one or more tags.
This operation only deletes the Schema data. All the files or directories in the disk will not be deleted directly until the next compaction.
Compatibility
In NebulaGraph master, inserting vertex without tag is not supported by default. If you want to use the vertex without tags, add --graph_use_vertex_key=true
to the configuration files (nebula-graphd.conf
) of all Graph services in the cluster, and add --use_vertex_key=true
to the configuration files (nebula-storaged.conf
) of all Storage services in the cluster.
DROP TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.[ERROR (-1005)]: Conflict!
) will be returned when you run the DROP TAG
statement. To drop an index, see DROP INDEX.DROP TAG [IF EXISTS] <tag_name>;\n
IF NOT EXISTS
: Detects if the tag that you want to drop exists. Only when it exists will it be dropped.tag_name
: Specifies the tag name that you want to drop. You can drop only one tag in one statement.nebula> CREATE TAG IF NOT EXISTS test(p1 string, p2 int);\nnebula> DROP TAG test;\n
"},{"location":"3.ngql-guide/10.tag-statements/3.alter-tag/","title":"ALTER TAG","text":"ALTER TAG
alters the structure of a tag with the given name in a graph space. You can add or drop properties, and change the data type of an existing property. You can also set a TTL (Time-To-Live) on a property, or change its TTL duration.
ALTER TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.[ERROR (-1005)]: Conflict!
will occur when you ALTER TAG
. For more information on dropping an index, see DROP INDEX.ALTER TAG <tag_name>\n <alter_definition> [[, alter_definition] ...]\n [ttl_definition [, ttl_definition] ... ]\n [COMMENT '<comment>'];\n\nalter_definition:\n| ADD (prop_name data_type [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'])\n| DROP (prop_name)\n| CHANGE (prop_name data_type [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>'])\n\nttl_definition:\n TTL_DURATION = ttl_duration, TTL_COL = prop_name\n
tag_name
: Specifies the tag name that you want to alter. You can alter only one tag in one statement. Before you alter a tag, make sure that the tag exists in the current working graph space. If the tag does not exist, an error will occur when you alter it.ADD
, DROP
, and CHANGE
clauses are permitted in a single ALTER TAG
statement, separated by commas.NOT NULL
using ADD
or CHANGE
, a default value must be specified for the property, that is, the value of DEFAULT
must be specified.When using CHANGE
to modify the data type of a property:
FIXED_STRING
or an INT
can be increased. The length of a STRING
or an INT
cannot be decreased.nebula> CREATE TAG IF NOT EXISTS t1 (p1 string, p2 int);\nnebula> ALTER TAG t1 ADD (p3 int32, fixed_string(10));\nnebula> ALTER TAG t1 TTL_DURATION = 2, TTL_COL = \"p2\";\nnebula> ALTER TAG t1 COMMENT = 'test1';\nnebula> ALTER TAG t1 ADD (p5 double NOT NULL DEFAULT 0.4 COMMENT 'p5') COMMENT='test2';\n// Change the data type of p3 in the TAG t1 from INT32 to INT64, and that of p4 from FIXED_STRING(10) to STRING.\nnebula> ALTER TAG t1 CHANGE (p3 int64, p4 string);\n[ERROR(-1005)]: Unsupported!\n
"},{"location":"3.ngql-guide/10.tag-statements/3.alter-tag/#implementation_of_the_operation","title":"Implementation of the operation","text":"Trying to use a newly altered tag may fail because the alteration of the tag is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
The SHOW TAGS
statement shows the name of all tags in the current graph space.
You do not need any privileges for the graph space to run the SHOW TAGS
statement. But the returned results are different based on role privileges.
SHOW TAGS;\n
"},{"location":"3.ngql-guide/10.tag-statements/4.show-tags/#examples","title":"Examples","text":"nebula> SHOW TAGS;\n+----------+\n| Name |\n+----------+\n| \"player\" |\n| \"team\" |\n+----------+\n
"},{"location":"3.ngql-guide/10.tag-statements/5.describe-tag/","title":"DESCRIBE TAG","text":"DESCRIBE TAG
returns the information about a tag with the given name in a graph space, such as field names, data type, and so on.
Running the DESCRIBE TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
DESC[RIBE] TAG <tag_name>;\n
You can use DESC
instead of DESCRIBE
for short.
nebula> DESCRIBE TAG player;\n+--------+----------+-------+---------+---------+\n| Field | Type | Null | Default | Comment |\n+--------+----------+-------+---------+---------+\n| \"name\" | \"string\" | \"YES\" | | |\n| \"age\" | \"int64\" | \"YES\" | | |\n+--------+----------+-------+---------+---------+\n
"},{"location":"3.ngql-guide/10.tag-statements/6.delete-tag/","title":"DELETE TAG","text":"DELETE TAG
deletes a tag with the given name on a specified vertex.
Running the DELETE TAG
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
DELETE TAG <tag_name_list> FROM <VID_list>;\n
tag_name_list
: The names of the tags you want to delete. Multiple tags are separated with commas (,). *
means all tags.VID
: The VIDs of the vertices from which you want to delete the tags. Multiple VIDs are separated with commas (,). nebula> CREATE TAG IF NOT EXISTS test1(p1 string, p2 int);\nnebula> CREATE TAG IF NOT EXISTS test2(p3 string, p4 int);\nnebula> INSERT VERTEX test1(p1, p2),test2(p3, p4) VALUES \"test\":(\"123\", 1, \"456\", 2);\nnebula> FETCH PROP ON * \"test\" YIELD vertex AS v;\n+------------------------------------------------------------+\n| v |\n+------------------------------------------------------------+\n| (\"test\" :test1{p1: \"123\", p2: 1} :test2{p3: \"456\", p4: 2}) |\n+------------------------------------------------------------+\nnebula> DELETE TAG test1 FROM \"test\";\nnebula> FETCH PROP ON * \"test\" YIELD vertex AS v;\n+-----------------------------------+\n| v |\n+-----------------------------------+\n| (\"test\" :test2{p3: \"456\", p4: 2}) |\n+-----------------------------------+\nnebula> DELETE TAG * FROM \"test\";\nnebula> FETCH PROP ON * \"test\" YIELD vertex AS v;\n+---+\n| v |\n+---+\n+---+\n
Compatibility
REMOVE v:LABEL
to delete the tag LABEL
of the vertex v
.DELETE TAG
and DROP TAG
have the same semantics but different syntax. In nGQL, use DELETE TAG
.OpenCypher has the features of SET label
and REMOVE label
to speed up the process of querying or labeling.
NebulaGraph achieves the same operations by creating and inserting tags to an existing vertex, which can quickly query vertices based on the tag name. Users can also run DELETE TAG
to delete some vertices that are no longer needed.
For example, in the basketballplayer
data set, some basketball players are also team shareholders. Users can create an index for the shareholder tag shareholder
for quick search. If the player is no longer a shareholder, users can delete the shareholder tag of the corresponding player by DELETE TAG
.
//This example creates the shareholder tag and index.\nnebula> CREATE TAG IF NOT EXISTS shareholder();\nnebula> CREATE TAG INDEX IF NOT EXISTS shareholder_tag on shareholder();\n\n//This example adds a tag on the vertex.\nnebula> INSERT VERTEX shareholder() VALUES \"player100\":();\nnebula> INSERT VERTEX shareholder() VALUES \"player101\":();\n\n//This example queries all the shareholders.\nnebula> MATCH (v:shareholder) RETURN v;\n+--------------------------------------------------------------------+\n| v |\n+--------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :shareholder{}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"} :shareholder{}) |\n+--------------------------------------------------------------------+\n\nnebula> LOOKUP ON shareholder YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n+-------------+\n\n//In this example, the \"player100\" is no longer a shareholder.\nnebula> DELETE TAG shareholder FROM \"player100\";\nnebula> LOOKUP ON shareholder YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player101\" |\n+-------------+\n
Note
If the index is created after inserting the test data, use the REBUILD TAG INDEX <index_name_list>;
statement to rebuild the index.
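For example, a sketch of rebuilding the shareholder_tag index created above:\nnebula> REBUILD TAG INDEX shareholder_tag;\n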
CREATE EDGE
creates an edge type with the given name in a graph space.
Edge types in nGQL are similar to relationship types in openCypher, but they are also quite different. For example, the ways to create them are different.
In openCypher, relationship types are created together with vertices in CREATE
statements.In nGQL, edge types need to be created separately using CREATE EDGE
statements. Edge types in nGQL are more like tables in MySQL.Running the CREATE EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
To create an edge type in a specific graph space, you must specify the current working space with the USE
statement.
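For example, a sketch assuming the basketballplayer graph space already exists:\nnebula> USE basketballplayer;\n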
CREATE EDGE [IF NOT EXISTS] <edge_type_name>\n (\n <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']\n [{, <prop_name> <data_type> [NULL | NOT NULL] [DEFAULT <default_value>] [COMMENT '<comment>']} ...] \n )\n [TTL_DURATION = <ttl_duration>]\n [TTL_COL = <prop_name>]\n [COMMENT = '<comment>'];\n
Parameter Description IF NOT EXISTS
Detects if the edge type that you want to create exists. If it does not exist, a new one will be created. The edge type existence detection here only compares the edge type names (excluding properties). <edge_type_name>
1. The edge type name must be unique in a graph space. 2. Once the edge type name is set, it can not be altered. 3. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number. 4. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words. Note:1. If you name an edge type in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in an edge type name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <prop_name>
The name of the property. It must be unique for each edge type. The rules for permitted property names are the same as those for edge type names. <data_type>
Shows the data type of each property. For a full description of the property data types, see Data types and Boolean. NULL | NOT NULL
Specifies if the property supports NULL | NOT NULL
. The default value is NULL
. DEFAULT
must be specified if NOT NULL
is set. DEFAULT
Specifies a default value for a property. The default value can be a literal value or an expression supported by NebulaGraph. If no value is specified, the default value is used when inserting a new edge. COMMENT
The remarks of a certain property or the edge type itself. The maximum length is 256 bytes. By default, there will be no comments on an edge type. TTL_DURATION
Specifies the life cycle for the property. The property that exceeds the specified TTL expires. The expiration threshold is the TTL_COL
value plus the TTL_DURATION
. The default value of TTL_DURATION
is 0
. It means the data never expires. TTL_COL
Specifies the property to set a timeout on. The data type of the property must be int
or timestamp
. An edge type can only specify one field as TTL_COL
. For more information on TTL, see TTL options."},{"location":"3.ngql-guide/11.edge-type-statements/1.create-edge/#examples","title":"Examples","text":"nebula> CREATE EDGE IF NOT EXISTS follow(degree int);\n\n# The following example creates an edge type with no properties.\nnebula> CREATE EDGE IF NOT EXISTS no_property();\n\n# The following example creates an edge type with a default value.\nnebula> CREATE EDGE IF NOT EXISTS follow_with_default(degree int DEFAULT 20);\n\n# In the following example, the TTL of the p2 field is set to be 100 seconds.\nnebula> CREATE EDGE IF NOT EXISTS e1(p1 string, p2 int, p3 timestamp) \\\n TTL_DURATION = 100, TTL_COL = \"p2\";\n
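As a sketch of the expiration arithmetic (the VIDs and the value 1648000000 are hypothetical): with TTL_DURATION = 100 and TTL_COL = \"p2\" on e1 above, an edge expires once the current timestamp exceeds p2 + 100.\n# The edge below expires when the current timestamp exceeds 1648000000 + 100 = 1648000100.\nnebula> INSERT EDGE e1 (p1, p2, p3) VALUES \"10\"->\"11\":(\"x\", 1648000000, now());\n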
"},{"location":"3.ngql-guide/11.edge-type-statements/2.drop-edge/","title":"DROP EDGE","text":"DROP EDGE
drops an edge type with the given name in a graph space.
An edge can have only one edge type. After you drop it, the edge CANNOT be accessed. The edge will be deleted in the next compaction.
This operation only deletes the schema data. The files and directories on the disk are not deleted directly until the next compaction.
"},{"location":"3.ngql-guide/11.edge-type-statements/2.drop-edge/#prerequisites","title":"Prerequisites","text":"DROP EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.Before you drop an edge type, make sure that the edge type does not have any indexes. Otherwise, the conflict error ([ERROR (-1005)]: Conflict!
) will be returned. To drop an index, see DROP INDEX.DROP EDGE [IF EXISTS] <edge_type_name>\n
IF EXISTS
: Detects if the edge type that you want to drop exists. Only when it exists will it be dropped.edge_type_name
: Specifies the edge type name that you want to drop. You can drop only one edge type in one statement.nebula> CREATE EDGE IF NOT EXISTS e1(p1 string, p2 int);\nnebula> DROP EDGE e1;\n
"},{"location":"3.ngql-guide/11.edge-type-statements/3.alter-edge/","title":"ALTER EDGE","text":"ALTER EDGE
alters the structure of an edge type with the given name in a graph space. You can add or drop properties, and change the data type of an existing property. You can also set a TTL (Time-To-Live) on a property, or change its TTL duration.
ALTER EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.[ERROR (-1005)]: Conflict!
will occur when you ALTER EDGE
. For more information on dropping an index, see DROP INDEX.FIXED_STRING
or an INT
can be increased.ALTER EDGE <edge_type_name>\n <alter_definition> [, alter_definition] ...]\n [ttl_definition [, ttl_definition] ... ]\n [COMMENT = '<comment>'];\n\nalter_definition:\n| ADD (prop_name data_type)\n| DROP (prop_name)\n| CHANGE (prop_name data_type)\n\nttl_definition:\n TTL_DURATION = ttl_duration, TTL_COL = prop_name\n
edge_type_name
: Specifies the edge type name that you want to alter. You can alter only one edge type in one statement. Before you alter an edge type, make sure that the edge type exists in the graph space. If the edge type does not exist, an error occurs when you alter it.ADD
, DROP
, and CHANGE
clauses are permitted in a single ALTER EDGE
statement, separated by commas.NOT NULL
using ADD
or CHANGE
, a default value must be specified for the property, that is, the value of DEFAULT
must be specified.nebula> CREATE EDGE IF NOT EXISTS e1(p1 string, p2 int);\nnebula> ALTER EDGE e1 ADD (p3 int, p4 string);\nnebula> ALTER EDGE e1 TTL_DURATION = 2, TTL_COL = \"p2\";\nnebula> ALTER EDGE e1 COMMENT = 'edge1';\n
"},{"location":"3.ngql-guide/11.edge-type-statements/3.alter-edge/#implementation_of_the_operation","title":"Implementation of the operation","text":"Trying to use a newly altered edge type may fail because the alteration of the edge type is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.
To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services.
SHOW EDGES
shows all edge types in the current graph space.
You do not need any privileges for the graph space to run the SHOW EDGES
statement. But the returned results are different based on role privileges.
SHOW EDGES;\n
"},{"location":"3.ngql-guide/11.edge-type-statements/4.show-edges/#example","title":"Example","text":"nebula> SHOW EDGES;\n+----------+\n| Name |\n+----------+\n| \"follow\" |\n| \"serve\" |\n+----------+\n
"},{"location":"3.ngql-guide/11.edge-type-statements/5.describe-edge/","title":"DESCRIBE EDGE","text":"DESCRIBE EDGE
returns the information about an edge type with the given name in a graph space, such as field names, data type, and so on.
Running the DESCRIBE EDGE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
DESC[RIBE] EDGE <edge_type_name>\n
You can use DESC
instead of DESCRIBE
for short.
nebula> DESCRIBE EDGE follow;\n+----------+---------+-------+---------+---------+\n| Field | Type | Null | Default | Comment |\n+----------+---------+-------+---------+---------+\n| \"degree\" | \"int64\" | \"YES\" | | |\n+----------+---------+-------+---------+---------+\n
"},{"location":"3.ngql-guide/12.vertex-statements/1.insert-vertex/","title":"INSERT VERTEX","text":"The INSERT VERTEX
statement inserts one or more vertices into a graph space in NebulaGraph.
Running the INSERT VERTEX
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
INSERT VERTEX [IF NOT EXISTS] [tag_props, [tag_props] ...]\nVALUES VID: ([prop_value_list])\n\ntag_props:\n tag_name ([prop_name_list])\n\nprop_name_list:\n [prop_name [, prop_name] ...]\n\nprop_value_list:\n [prop_value [, prop_value] ...] \n
IF NOT EXISTS
detects if the VID that you want to insert exists. If it does not exist, a new one will be inserted.
Note
IF NOT EXISTS
only compares the names of the VID and the tag (excluding properties).IF NOT EXISTS
will read to check whether the data exists, which will have a significant impact on performance.tag_name
denotes the tag (vertex type), which must be created before INSERT VERTEX
. For more information, see CREATE TAG.
Caution
NebulaGraph master supports inserting vertices without tags.
Compatibility
In NebulaGraph master, inserting vertex without tag is not supported by default. If you want to use the vertex without tags, add --graph_use_vertex_key=true
to the configuration files (nebula-graphd.conf
) of all Graph services in the cluster, add --use_vertex_key=true
to the configuration files (nebula-storaged.conf
) of all Storage services in the cluster. An example of a command to insert a vertex without tag is INSERT VERTEX VALUES \"1\":();
.
prop_name_list
contains the names of the properties on the tag.VID
is the vertex ID. In NebulaGraph 2.0, string and integer VID types are supported. The VID type is set when a graph space is created. For more information, see CREATE SPACE.prop_value_list
must provide the property values according to the prop_name_list
. When the NOT NULL
constraint is set for a given property, an error is returned if no property is given. When the default value for a property is NULL
, you can omit to specify the property value. For details, see CREATE TAG.Caution
INSERT VERTEX
and CREATE
have different semantics.
INSERT VERTEX
is closer to that of INSERT in NoSQL (key-value), or UPSERT
(UPDATE
or INSERT
) in SQL.IF NOT EXISTS
) with the same VID
and TAG
are operated at the same time, the latter INSERT will overwrite the former.VID
but different TAGS
are operated at the same time, the operation of different tags will not overwrite each other.Examples are as follows.
"},{"location":"3.ngql-guide/12.vertex-statements/1.insert-vertex/#examples","title":"Examples","text":"# Insert a vertex without tag.\nnebula> INSERT VERTEX VALUES \"1\":();\n\n# The following examples create tag t1 with no property and inserts vertex \"10\" with no property.\nnebula> CREATE TAG IF NOT EXISTS t1(); \nnebula> INSERT VERTEX t1() VALUES \"10\":(); \n
nebula> CREATE TAG IF NOT EXISTS t2 (name string, age int); \nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n1\", 12);\n\n# In the following example, the insertion fails because \"a13\" is not int.\nnebula> INSERT VERTEX t2 (name, age) VALUES \"12\":(\"n1\", \"a13\"); \n\n# The following example inserts two vertices at one time.\nnebula> INSERT VERTEX t2 (name, age) VALUES \"13\":(\"n3\", 12), \"14\":(\"n4\", 8); \n
nebula> CREATE TAG IF NOT EXISTS t3(p1 int);\nnebula> CREATE TAG IF NOT EXISTS t4(p2 string);\n\n# The following example inserts vertex \"21\" with two tags.\nnebula> INSERT VERTEX t3 (p1), t4(p2) VALUES \"21\": (321, \"hello\");\n
A vertex can be inserted/written with new values multiple times. Only the last written values can be read.
# The following examples insert vertex \"11\" with new values for multiple times.\nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n2\", 13);\nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n3\", 14);\nnebula> INSERT VERTEX t2 (name, age) VALUES \"11\":(\"n4\", 15);\nnebula> FETCH PROP ON t2 \"11\" YIELD properties(vertex);\n+-----------------------+\n| properties(VERTEX) |\n+-----------------------+\n| {age: 15, name: \"n4\"} |\n+-----------------------+\n
nebula> CREATE TAG IF NOT EXISTS t5(p1 fixed_string(5) NOT NULL, p2 int, p3 int DEFAULT NULL);\nnebula> INSERT VERTEX t5(p1, p2, p3) VALUES \"001\":(\"Abe\", 2, 3);\n\n# In the following example, the insertion fails because the value of p1 cannot be NULL.\nnebula> INSERT VERTEX t5(p1, p2, p3) VALUES \"002\":(NULL, 4, 5);\n[ERROR (-1009)]: SemanticError: No schema found for `t5'\n\n# In the following example, the value of p3 is the default NULL.\nnebula> INSERT VERTEX t5(p1, p2) VALUES \"003\":(\"cd\", 5);\nnebula> FETCH PROP ON t5 \"003\" YIELD properties(vertex);\n+---------------------------------+\n| properties(VERTEX) |\n+---------------------------------+\n| {p1: \"cd\", p2: 5, p3: __NULL__} |\n+---------------------------------+\n\n# In the following example, the allowed maximum length of p1 is 5.\nnebula> INSERT VERTEX t5(p1, p2) VALUES \"004\":(\"shalalalala\", 4);\nnebula> FETCH PROP on t5 \"004\" YIELD properties(vertex);\n+------------------------------------+\n| properties(VERTEX) |\n+------------------------------------+\n| {p1: \"shala\", p2: 4, p3: __NULL__} |\n+------------------------------------+\n
If you insert a vertex that already exists with IF NOT EXISTS
, there will be no modification.
# The following example inserts vertex \"1\".\nnebula> INSERT VERTEX t2 (name, age) VALUES \"1\":(\"n2\", 13);\n# Modify vertex \"1\" with IF NOT EXISTS. But there will be no modification as vertex \"1\" already exists.\nnebula> INSERT VERTEX IF NOT EXISTS t2 (name, age) VALUES \"1\":(\"n3\", 14);\nnebula> FETCH PROP ON t2 \"1\" YIELD properties(vertex);\n+-----------------------+\n| properties(VERTEX) |\n+-----------------------+\n| {age: 13, name: \"n2\"} |\n+-----------------------+\n
"},{"location":"3.ngql-guide/12.vertex-statements/2.update-vertex/","title":"UPDATE VERTEX","text":"The UPDATE VERTEX
statement updates properties on tags of a vertex.
In NebulaGraph, UPDATE VERTEX
supports compare-and-set (CAS).
Note
An UPDATE VERTEX
statement can only update properties on ONE TAG of a vertex.
UPDATE VERTEX ON <tag_name> <vid>\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <output>]\n
Parameter Required Description Example ON <tag_name>
Yes Specifies the tag of the vertex. The properties to be updated must be on this tag. ON player
<vid>
Yes Specifies the ID of the vertex to be updated. \"player100\"
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET age = age +1
WHEN <condition>
No Specifies the filter conditions. If <condition>
evaluates to false
, the SET
clause will not take effect. WHEN name == \"Tim\"
YIELD <output>
No Specifies the output format of the statement. YIELD name AS Name
"},{"location":"3.ngql-guide/12.vertex-statements/2.update-vertex/#example","title":"Example","text":"// This query checks the properties of vertex \"player101\".\nnebula> FETCH PROP ON player \"player101\" YIELD properties(vertex);\n+--------------------------------+\n| properties(VERTEX) |\n+--------------------------------+\n| {age: 36, name: \"Tony Parker\"} |\n+--------------------------------+\n\n// This query updates the age property and returns name and the new age.\nnebula> UPDATE VERTEX ON player \"player101\" \\\n SET age = age + 2 \\\n WHEN name == \"Tony Parker\" \\\n YIELD name AS Name, age AS Age;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 38 |\n+---------------+-----+\n
"},{"location":"3.ngql-guide/12.vertex-statements/3.upsert-vertex/","title":"UPSERT VERTEX","text":"The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT VERTEX
to update the properties of a vertex if it exists or insert a new vertex if it does not exist.
Note
An UPSERT VERTEX
statement can only update the properties on ONE TAG of a vertex.
The performance of UPSERT
is much lower than that of INSERT
because UPSERT
is a read-modify-write serialization operation at the partition level.
Danger
Don't use UPSERT
for scenarios with highly concurrent writes. You can use UPDATE
or INSERT
instead.
UPSERT VERTEX ON <tag> <vid>\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <output>]\n
Parameter Required Description Example ON <tag>
Yes Specifies the tag of the vertex. The properties to be updated must be on this tag. ON player
<vid>
Yes Specifies the ID of the vertex to be updated or inserted. \"player100\"
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET age = age +1
WHEN <condition>
No Specifies the filter conditions. WHEN name == \"Tim\"
YIELD <output>
No Specifies the output format of the statement. YIELD name AS Name
"},{"location":"3.ngql-guide/12.vertex-statements/3.upsert-vertex/#insert_a_vertex_if_it_does_not_exist","title":"Insert a vertex if it does not exist","text":"If a vertex does not exist, it is created no matter the conditions in the WHEN
clause are met or not, and the SET
clause always takes effect. The property values of the new vertex depend on:
SET
clause is defined.For example, if:
name
and age
based on the tag player
.SET
clause specifies that age = 30
.Then the property values in different cases are listed as follows:
AreWHEN
conditions met If properties have default values Value of name
Value of age
Yes Yes The default value 30
Yes No NULL
30
No Yes The default value 30
No No NULL
30
Here are some examples:
// This query checks if the following three vertices exist. The result \"Empty set\" indicates that the vertices do not exist.\nnebula> FETCH PROP ON * \"player666\", \"player667\", \"player668\" YIELD properties(vertex);\n+--------------------+\n| properties(VERTEX) |\n+--------------------+\n+--------------------+\nEmpty set\n\nnebula> UPSERT VERTEX ON player \"player666\" \\\n SET age = 30 \\\n WHEN name == \"Joe\" \\\n YIELD name AS Name, age AS Age;\n+----------+----------+\n| Name | Age |\n+----------+----------+\n| __NULL__ | 30 |\n+----------+----------+\n\nnebula> UPSERT VERTEX ON player \"player666\" \\\n SET age = 31 \\\n WHEN name == \"Joe\" \\\n YIELD name AS Name, age AS Age;\n+----------+-----+\n| Name | Age |\n+----------+-----+\n| __NULL__ | 30 |\n+----------+-----+\n\nnebula> UPSERT VERTEX ON player \"player667\" \\\n SET age = 31 \\\n YIELD name AS Name, age AS Age;\n+----------+-----+\n| Name | Age |\n+----------+-----+\n| __NULL__ | 31 |\n+----------+-----+\n\nnebula> UPSERT VERTEX ON player \"player668\" \\\n SET name = \"Amber\", age = age + 1 \\\n YIELD name AS Name, age AS Age;\n+---------+----------+\n| Name | Age |\n+---------+----------+\n| \"Amber\" | __NULL__ |\n+---------+----------+\n
In the last query of the preceding examples, since age
has no default value, when the vertex is created, age
is NULL
, and age = age + 1
does not take effect. But if age
has a default value, age = age + 1
will take effect. For example:
nebula> CREATE TAG IF NOT EXISTS player_with_default(name string, age int DEFAULT 20);\nExecution succeeded\n\nnebula> UPSERT VERTEX ON player_with_default \"player101\" \\\n SET age = age + 1 \\\n YIELD name AS Name, age AS Age;\n\n+----------+-----+\n| Name | Age |\n+----------+-----+\n| __NULL__ | 21 |\n+----------+-----+\n
"},{"location":"3.ngql-guide/12.vertex-statements/3.upsert-vertex/#update_a_vertex_if_it_exists","title":"Update a vertex if it exists","text":"If the vertex exists and the WHEN
conditions are met, the vertex is updated.
nebula> FETCH PROP ON player \"player101\" YIELD properties(vertex);\n+--------------------------------+\n| properties(VERTEX) |\n+--------------------------------+\n| {age: 36, name: \"Tony Parker\"} |\n+--------------------------------+\n\nnebula> UPSERT VERTEX ON player \"player101\" \\\n SET age = age + 2 \\\n WHEN name == \"Tony Parker\" \\\n YIELD name AS Name, age AS Age;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 38 |\n+---------------+-----+\n
If the vertex exists and the WHEN
conditions are not met, the update does not take effect.
nebula> FETCH PROP ON player \"player101\" YIELD properties(vertex);\n+--------------------------------+\n| properties(VERTEX) |\n+--------------------------------+\n| {age: 38, name: \"Tony Parker\"} |\n+--------------------------------+\n\nnebula> UPSERT VERTEX ON player \"player101\" \\\n SET age = age + 2 \\\n WHEN name == \"Someone else\" \\\n YIELD name AS Name, age AS Age;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 38 |\n+---------------+-----+\n
"},{"location":"3.ngql-guide/12.vertex-statements/4.delete-vertex/","title":"DELETE VERTEX","text":"By default, the DELETE VERTEX
statement deletes vertices but the incoming and outgoing edges of the vertices.
Compatibility
The DELETE VERTEX
statement deletes one vertex or multiple vertices at a time. You can use DELETE VERTEX
together with pipes. For more information about pipe, see Pipe operator.
Note
DELETE VERTEX
deletes vertices directly.DELETE TAG
deletes a tag with the given name on a specified vertex.DELETE VERTEX <vid> [, <vid> ...] [WITH EDGE];\n
This query deletes the vertex whose ID is \"team1\".
# Delete the vertex whose VID is `team1` but the related incoming and outgoing edges are not deleted.\nnebula> DELETE VERTEX \"team1\";\n\n# Delete the vertex whose VID is `team1` and the related incoming and outgoing edges.\nnebula> DELETE VERTEX \"team1\" WITH EDGE;\n
This query shows that you can use DELETE VERTEX
together with pipe to delete vertices.
nebula> GO FROM \"player100\" OVER serve WHERE properties(edge).start_year == \"2021\" YIELD dst(edge) AS id | DELETE VERTEX $-.id;\n
"},{"location":"3.ngql-guide/12.vertex-statements/4.delete-vertex/#process_of_deleting_vertices","title":"Process of deleting vertices","text":"Once NebulaGraph deletes the vertices, all edges (incoming and outgoing edges) of the target vertex will become dangling edges. When NebulaGraph deletes the vertices WITH EDGE
, NebulaGraph traverses the incoming and outgoing edges related to the vertices and deletes them all. Then NebulaGraph deletes the vertices.
Caution
--storage_client_timeout_ms
in nebula-graphd.conf
to extend the timeout period.The INSERT EDGE
statement inserts an edge or multiple edges into a graph space from a source vertex (given by src_vid) to a destination vertex (given by dst_vid) with a specific rank in NebulaGraph.
When inserting an edge that already exists, INSERT EDGE
overrides the edge.
INSERT EDGE [IF NOT EXISTS] <edge_type> ( <prop_name_list> ) VALUES \n<src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> )\n[, <src_vid> -> <dst_vid>[@<rank>] : ( <prop_value_list> ), ...];\n\n<prop_name_list> ::=\n [ <prop_name> [, <prop_name> ] ...]\n\n<prop_value_list> ::=\n [ <prop_value> [, <prop_value> ] ...]\n
IF NOT EXISTS
detects if the edge that you want to insert exists. If it does not exist, a new one will be inserted.
Note
IF NOT EXISTS
only detects whether exist and does not detect whether the property values overlap. IF NOT EXISTS
will read to check whether the data exists, which will have a significant impact on performance.<edge_type>
denotes the edge type, which must be created before INSERT EDGE
. Only one edge type can be specified in this statement.<prop_name_list>
is the property name list in the given <edge_type>
.src_vid
is the VID of the source vertex. It specifies the start of an edge.dst_vid
is the VID of the destination vertex. It specifies the end of an edge.rank
is optional. It specifies the edge rank of the same edge type. The data type is int
. If not specified, the default value is 0
. You can insert many edges with the same edge type, source vertex, and destination vertex by using different rank values.
OpenCypher compatibility
OpenCypher has no such concept as rank.
<prop_value_list>
must provide the value list according to <prop_name_list>
. If the property values do not match the data type in the edge type, an error is returned. When the NOT NULL
constraint is set for a given property, an error is returned if no property is given. When the default value for a property is NULL
, you can omit to specify the property value. For details, see CREATE EDGE.# The following example creates edge type e1 with no property and inserts an edge from vertex \"10\" to vertex \"11\" with no property.\nnebula> CREATE EDGE IF NOT EXISTS e1(); \nnebula> INSERT EDGE e1 () VALUES \"10\"->\"11\":(); \n\n# The following example inserts an edge from vertex \"10\" to vertex \"11\" with no property. The edge rank is 1.\nnebula> INSERT EDGE e1 () VALUES \"10\"->\"11\"@1:(); \n
nebula> CREATE EDGE IF NOT EXISTS e2 (name string, age int); \nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 1);\n\n# The following example creates edge type e2 with two properties.\nnebula> INSERT EDGE e2 (name, age) VALUES \\\n \"12\"->\"13\":(\"n1\", 1), \"13\"->\"14\":(\"n2\", 2); \n\n# In the following example, the insertion fails because \"a13\" is not int.\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", \"a13\");\n
An edge can be inserted/written with property values multiple times. Only the last written values can be read.
The following examples insert edge e2 with the new values for multiple times.\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 12);\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 13);\nnebula> INSERT EDGE e2 (name, age) VALUES \"11\"->\"13\":(\"n1\", 14);\nnebula> FETCH PROP ON e2 \"11\"->\"13\" YIELD edge AS e;\n+-------------------------------------------+\n| e |\n+-------------------------------------------+\n| [:e2 \"11\"->\"13\" @0 {age: 14, name: \"n1\"}] |\n+-------------------------------------------+\n
If you insert an edge that already exists with IF NOT EXISTS
, there will be no modification.
# The following example inserts edge e2 from vertex \"14\" to vertex \"15\".\nnebula> INSERT EDGE e2 (name, age) VALUES \"14\"->\"15\"@1:(\"n1\", 12);\n# The following example alters the edge with IF NOT EXISTS. But there will be no alteration because edge e2 already exists.\nnebula> INSERT EDGE IF NOT EXISTS e2 (name, age) VALUES \"14\"->\"15\"@1:(\"n2\", 13);\nnebula> FETCH PROP ON e2 \"14\"->\"15\"@1 YIELD edge AS e;\n+-------------------------------------------+\n| e |\n+-------------------------------------------+\n| [:e2 \"14\"->\"15\" @1 {age: 12, name: \"n1\"}] |\n+-------------------------------------------+\n
Note
<edgetype>._src
or <edgetype>._dst
(which is not recommended).edge conflict
error, so please try again later.The UPDATE EDGE
statement updates properties on an edge.
In NebulaGraph, UPDATE EDGE
supports compare-and-swap (CAS).
UPDATE EDGE ON <edge_type>\n<src_vid> -> <dst_vid> [@<rank>]\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <output>]\n
Parameter Required Description Example ON <edge_type>
Yes Specifies the edge type. The properties to be updated must be on this edge type. ON serve
<src_vid>
Yes Specifies the source vertex ID of the edge. \"player100\"
<dst_vid>
Yes Specifies the destination vertex ID of the edge. \"team204\"
<rank>
No Specifies the rank of the edge. The data type is int
. 10
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET start_year = start_year +1
WHEN <condition>
No Specifies the filter conditions. If <condition>
evaluates to false
, the SET
clause does not take effect. WHEN end_year < 2010
YIELD <output>
No Specifies the output format of the statement. YIELD start_year AS Start_Year
"},{"location":"3.ngql-guide/13.edge-statements/2.update-edge/#example","title":"Example","text":"The following example checks the properties of the edge with the GO statement.
nebula> GO FROM \"player100\" \\\n OVER serve \\\n YIELD properties(edge).start_year, properties(edge).end_year;\n+-----------------------------+---------------------------+\n| properties(EDGE).start_year | properties(EDGE).end_year |\n+-----------------------------+---------------------------+\n| 1997 | 2016 |\n+-----------------------------+---------------------------+\n
The following example updates the start_year
property and returns the end_year
and the new start_year
.
nebula> UPDATE EDGE on serve \"player100\" -> \"team204\"@0 \\\n SET start_year = start_year + 1 \\\n WHEN end_year > 2010 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 1998 | 2016 |\n+------------+----------+\n
"},{"location":"3.ngql-guide/13.edge-statements/3.upsert-edge/","title":"UPSERT EDGE","text":"The UPSERT
statement is a combination of UPDATE
and INSERT
. You can use UPSERT EDGE
to update the properties of an edge if it exists or insert a new edge if it does not exist.
The performance of UPSERT
is much lower than that of INSERT
because UPSERT
is a read-modify-write serialization operation at the partition level.
Danger
Do not use UPSERT
for scenarios with highly concurrent writes. You can use UPDATE
or INSERT
instead.
UPSERT EDGE ON <edge_type>\n<src_vid> -> <dst_vid> [@rank]\nSET <update_prop>\n[WHEN <condition>]\n[YIELD <properties>]\n
Parameter Required Description Example ON <edge_type>
Yes Specifies the edge type. The properties to be updated must be on this edge type. ON serve
<src_vid>
Yes Specifies the source vertex ID of the edge. \"player100\"
<dst_vid>
Yes Specifies the destination vertex ID of the edge. \"team204\"
<rank>
No Specifies the rank of the edge. 10
SET <update_prop>
Yes Specifies the properties to be updated and how they will be updated. SET start_year = start_year +1
WHEN <condition>
No Specifies the filter conditions. WHEN end_year < 2010
YIELD <output>
No Specifies the output format of the statement. YIELD start_year AS Start_Year
"},{"location":"3.ngql-guide/13.edge-statements/3.upsert-edge/#insert_an_edge_if_it_does_not_exist","title":"Insert an edge if it does not exist","text":"If an edge does not exist, it is created no matter the conditions in the WHEN
clause are met or not, and the SET
clause takes effect. The property values of the new edge depend on:
SET
clause is defined.For example, if:
start_year
and end_year
based on the edge type serve
.SET
clause specifies that end_year = 2021
.Then the property values in different cases are listed as follows:
AreWHEN
conditions met If properties have default values Value of start_year
Value of end_year
Yes Yes The default value 2021
Yes No NULL
2021
No Yes The default value 2021
No No NULL
2021
Here are some examples:
// This example checks if the following three vertices have any outgoing serve edge. The result \"Empty set\" indicates that such an edge does not exist.\nnebula> GO FROM \"player666\", \"player667\", \"player668\" \\\n OVER serve \\\n YIELD properties(edge).start_year, properties(edge).end_year;\n+-----------------------------+---------------------------+\n| properties(EDGE).start_year | properties(EDGE).end_year |\n+-----------------------------+---------------------------+\n+-----------------------------+---------------------------+\nEmpty set\n\nnebula> UPSERT EDGE on serve \\\n \"player666\" -> \"team200\"@0 \\\n SET end_year = 2021 \\\n WHEN end_year == 2010 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2021 |\n+------------+----------+\n\nnebula> UPSERT EDGE on serve \\\n \"player666\" -> \"team200\"@0 \\\n SET end_year = 2022 \\\n WHEN end_year == 2010 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2021 |\n+------------+----------+\n\nnebula> UPSERT EDGE on serve \\\n \"player667\" -> \"team200\"@0 \\\n SET end_year = 2022 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2022 |\n+------------+----------+\n\nnebula> UPSERT EDGE on serve \\\n \"player668\" -> \"team200\"@0 \\\n SET start_year = 2000, end_year = end_year + 1 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 2000 | __NULL__ |\n+------------+----------+\n
In the last query of the preceding example, since end_year
has no default value, when the edge is created, end_year
is NULL
, and end_year = end_year + 1
does not take effect. But if end_year
has a default value, end_year = end_year + 1
will take effect. For example:
nebula> CREATE EDGE IF NOT EXISTS serve_with_default(start_year int, end_year int DEFAULT 2010);\nExecution succeeded\n\nnebula> UPSERT EDGE on serve_with_default \\\n \"player668\" -> \"team200\" \\\n SET end_year = end_year + 1 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| __NULL__ | 2011 |\n+------------+----------+\n
"},{"location":"3.ngql-guide/13.edge-statements/3.upsert-edge/#update_an_edge_if_it_exists","title":"Update an edge if it exists","text":"If the edge exists and the WHEN
conditions are met, the edge is updated.
nebula> MATCH (v:player{name:\"Ben Simmons\"})-[e:serve]-(v2) \\\n RETURN e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player149\"->\"team219\" @0 {end_year: 2019, start_year: 2016}] |\n+-----------------------------------------------------------------------+\n\nnebula> UPSERT EDGE on serve \\\n \"player149\" -> \"team219\" \\\n SET end_year = end_year + 1 \\\n WHEN start_year == 2016 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 2016 | 2020 |\n+------------+----------+\n
If the edge exists and the WHEN
conditions are not met, the update does not take effect.
nebula> MATCH (v:player{name:\"Ben Simmons\"})-[e:serve]-(v2) \\\n RETURN e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player149\"->\"team219\" @0 {end_year: 2020, start_year: 2016}] |\n+-----------------------------------------------------------------------+\n\n\nnebula> UPSERT EDGE on serve \\\n \"player149\" -> \"team219\" \\\n SET end_year = end_year + 1 \\\n WHEN start_year != 2016 \\\n YIELD start_year, end_year;\n+------------+----------+\n| start_year | end_year |\n+------------+----------+\n| 2016 | 2020 |\n+------------+----------+\n
"},{"location":"3.ngql-guide/13.edge-statements/4.delete-edge/","title":"DELETE EDGE","text":"The DELETE EDGE
statement deletes one edge or multiple edges at a time. You can use DELETE EDGE
together with pipe operators. For more information, see PIPE OPERATORS.
To delete all the outgoing edges for a vertex, please delete the vertex. For more information, see DELETE VERTEX.
"},{"location":"3.ngql-guide/13.edge-statements/4.delete-edge/#syntax","title":"Syntax","text":"DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid>[@<rank>] ...]\n
Caution
If no rank is specified, NebulaGraph only deletes the edge with rank 0. Delete edges with all ranks, as shown in the following example.
"},{"location":"3.ngql-guide/13.edge-statements/4.delete-edge/#examples","title":"Examples","text":"nebula> DELETE EDGE serve \"player100\" -> \"team204\"@0;\n
The following example shows that you can use DELETE EDGE
together with pipe operators to delete edges that meet the conditions.
nebula> GO FROM \"player100\" OVER follow \\\n WHERE dst(edge) == \"player101\" \\\n YIELD src(edge) AS src, dst(edge) AS dst, rank(edge) AS rank \\\n | DELETE EDGE follow $-.src->$-.dst @ $-.rank;\n
"},{"location":"3.ngql-guide/14.native-index-statements/","title":"Index overview","text":"Indexes are built to fast process graph queries. Nebula\u00a0Graph supports two kinds of indexes: native indexes and full-text indexes. This topic introduces the index types and helps choose the right index.
"},{"location":"3.ngql-guide/14.native-index-statements/#usage_instructions","title":"Usage Instructions","text":"LOOKUP
statement. If there is no index, an error will be reported when executing the LOOKUP
statement.ID numbers
is 1
), can significantly improve query performance. For indexes with low selectivity (such as country
), query performance might not experience a substantial improvement.Native indexes allow querying data based on a given property. Features are as follows.
REBUILD INDEX
statement to update native indexes.Full-text indexes are used to do prefix, wildcard, regexp, and fuzzy search on a string property. Features are as follows.
AND
, OR
, and NOT
.Note
To do complete string matches, use native indexes.
"},{"location":"3.ngql-guide/14.native-index-statements/#null_values","title":"Null values","text":"Indexes do not support indexing null values.
"},{"location":"3.ngql-guide/14.native-index-statements/#range_queries","title":"Range queries","text":"In addition to querying single results from native indexes, you can also do range queries. Not all the native indexes support range queries. You can only do range searches for numeric, date, and time type properties.
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/","title":"CREATE INDEX","text":""},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#prerequisites","title":"Prerequisites","text":"Before you create an index, make sure that the relative tag or edge type is created. For how to create tags or edge types, see CREATE TAG and CREATE EDGE.
For how to create full-text indexes, see Deploy full-text index.
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#must-read_for_using_indexes","title":"Must-read for using indexes","text":"The concept and using restrictions of indexes are comparatively complex. Before you use indexes, you must read the following sections carefully.
You can use CREATE INDEX
to add native indexes for the existing tags, edge types, or properties. They are usually called as tag indexes, edge type indexes, and property indexes.
LOOKUP
to retrieve all the vertices with the tag player
.age
property to retrieve the VID of all vertices that meet age == 19
.If a property index i_TA
is created for the property A
of the tag T
and i_T
for the tag T
, the indexes can be replaced as follows (the same for edge type indexes):
i_TA
to replace i_T
.In the MATCH
and LOOKUP
statements, i_T
may replace i_TA
for querying properties.
Legacy version compatibility
In previous releases, the tag or edge type index in the LOOKUP
statement cannot replace the property index for property queries.
Although the same results can be obtained by using alternative indexes for queries, the query performance varies according to the selected index.
Caution
Indexes can dramatically reduce the write performance. The performance can be greatly reduced. DO NOT use indexes in production environments unless you are fully aware of their influences on your service.
Long indexes decrease the scan performance of the Storage Service and use more memory. We suggest that you set the indexing length the same as that of the longest string to be indexed. For variable-length string-type properties, the longest index length is 256 bytes; for fixed-length string-type properties, the longest index length is the length of the index itself.
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#steps","title":"Steps","text":"If you must use indexes, we suggest that you:
Import the data into NebulaGraph.
Create indexes.
Rebuild indexes.
After the index is created and the data is imported, you can use LOOKUP or MATCH to retrieve the data. You do not need to specify which indexes to use in a query, NebulaGraph figures that out by itself.
Note
If you create an index before importing the data, the importing speed will be extremely slow due to the reduction in the write performance.
Keep --disable_auto_compaction = false
during daily incremental writing.
The newly created index will not take effect immediately. Trying to use a newly created index (such as LOOKUP
orREBUILD INDEX
) may fail and return can't find xxx in the space
because the creation is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs
in the configuration files for all services.
Danger
After creating a new index, or dropping the old index and creating a new one with the same name again, you must REBUILD INDEX
. Otherwise, these data cannot be returned in the MATCH
and LOOKUP
statements.
CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name> ON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT '<comment>'];\n
Parameter Description TAG | EDGE
Specifies the index type that you want to create. IF NOT EXISTS
Detects if the index that you want to create exists. If it does not exist, a new one will be created. <index_name>
1. The name of the index. It must be unique in a graph space. A recommended way of naming is i_tagName_propName
. 2. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number.3. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words.Note:1. If you name an index in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in an index name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <tag_name> | <edge_name>
Specifies the name of the tag or edge associated with the index. <prop_name_list>
To index a variable-length string property, you must use prop_name(length)
to specify the index length, and the maximum index length is 256. To index a tag or an edge type, ignore the prop_name_list
. COMMENT
The remarks of the index. The maximum length is 256 bytes. By default, there will be no comments on an index."},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#create_tagedge_type_indexes","title":"Create tag/edge type indexes","text":"nebula> CREATE TAG INDEX IF NOT EXISTS player_index on player();\n
nebula> CREATE EDGE INDEX IF NOT EXISTS follow_index on follow();\n
After indexing a tag or an edge type, you can use the LOOKUP
statement to retrieve the VID of all vertices with the tag
, or the source vertex ID, destination vertex ID, and ranks
of all edges with the edge type
. For more information, see LOOKUP.
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_0 on player(name(10));\n
The preceding example creates an index for the name
property on all vertices carrying the player
tag. This example creates an index using the first 10 characters of the name
property.
# To index a variable-length string property, you need to specify the index length.\nnebula> CREATE TAG IF NOT EXISTS var_string(p1 string);\nnebula> CREATE TAG INDEX IF NOT EXISTS var ON var_string(p1(10));\n\n# To index a fixed-length string property, you do not need to specify the index length.\nnebula> CREATE TAG IF NOT EXISTS fix_string(p1 FIXED_STRING(10));\nnebula> CREATE TAG INDEX IF NOT EXISTS fix ON fix_string(p1);\n
nebula> CREATE EDGE INDEX IF NOT EXISTS follow_index_0 on follow(degree);\n
"},{"location":"3.ngql-guide/14.native-index-statements/1.create-native-index/#create_composite_property_indexes","title":"Create composite property indexes","text":"An index on multiple properties on a tag (or an edge type) is called a composite property index.
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_1 on player(name(10), age);\n
Caution
Creating composite property indexes across multiple tags or edge types is not supported.
Note
NebulaGraph follows the left matching principle to select indexes.
"},{"location":"3.ngql-guide/14.native-index-statements/2.1.show-create-index/","title":"SHOW CREATE INDEX","text":"SHOW CREATE INDEX
shows the statement used when creating a tag or an edge type. It contains detailed information about the index, such as its associated properties.
SHOW CREATE {TAG | EDGE} INDEX <index_name>;\n
"},{"location":"3.ngql-guide/14.native-index-statements/2.1.show-create-index/#examples","title":"Examples","text":"You can run SHOW TAG INDEXES
to list all tag indexes, and then use SHOW CREATE TAG INDEX
to show the information about the creation of the specified index.
nebula> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\n\nnebula> SHOW CREATE TAG INDEX player_index_1;\n+------------------+--------------------------------------------------+\n| Tag Index Name | Create Tag Index |\n+------------------+--------------------------------------------------+\n| \"player_index_1\" | \"CREATE TAG INDEX `player_index_1` ON `player` ( |\n| | `name`(20) |\n| | )\" |\n+------------------+--------------------------------------------------+\n
Edge indexes can be queried through a similar approach.
nebula> SHOW EDGE INDEXES;\n+----------------+----------+---------+\n| Index Name | By Edge | Columns |\n+----------------+----------+---------+\n| \"follow_index\" | \"follow\" | [] |\n+----------------+----------+---------+\n\nnebula> SHOW CREATE EDGE INDEX follow_index;\n+-----------------+-------------------------------------------------+\n| Edge Index Name | Create Edge Index |\n+-----------------+-------------------------------------------------+\n| \"follow_index\" | \"CREATE EDGE INDEX `follow_index` ON `follow` ( |\n| | )\" |\n+-----------------+-------------------------------------------------+\n
"},{"location":"3.ngql-guide/14.native-index-statements/2.show-native-indexes/","title":"SHOW INDEXES","text":"SHOW INDEXES
shows the defined tag or edge type indexes names in the current graph space.
SHOW {TAG | EDGE} INDEXES\n
"},{"location":"3.ngql-guide/14.native-index-statements/2.show-native-indexes/#examples","title":"Examples","text":"nebula> SHOW TAG INDEXES;\n+------------------+--------------+-----------------+\n| Index Name | By Tag | Columns |\n+------------------+--------------+-----------------+\n| \"fix\" | \"fix_string\" | [\"p1\"] |\n| \"player_index_0\" | \"player\" | [\"name\"] |\n| \"player_index_1\" | \"player\" | [\"name\", \"age\"] |\n| \"var\" | \"var_string\" | [\"p1\"] |\n+------------------+--------------+-----------------+\n\nnebula> SHOW EDGE INDEXES;\n+----------------+----------+---------+\n| Index Name | By Edge | Columns |\n| \"follow_index\" | \"follow\" | [] |\n+----------------+----------+---------+\n
Legacy version compatibility
In NebulaGraph 2.x, the SHOW TAG/EDGE INDEXES
statement only returns Names
.
DESCRIBE INDEX
can get the information about the index with a given name, including the property name (Field) and the property type (Type) of the index.
DESCRIBE {TAG | EDGE} INDEX <index_name>;\n
"},{"location":"3.ngql-guide/14.native-index-statements/3.describe-native-index/#examples","title":"Examples","text":"nebula> DESCRIBE TAG INDEX player_index_0;\n+--------+--------------------+\n| Field | Type |\n+--------+--------------------+\n| \"name\" | \"fixed_string(30)\" |\n+--------+--------------------+\n\nnebula> DESCRIBE TAG INDEX player_index_1;\n+--------+--------------------+\n| Field | Type |\n+--------+--------------------+\n| \"name\" | \"fixed_string(10)\" |\n| \"age\" | \"int64\" |\n+--------+--------------------+\n
"},{"location":"3.ngql-guide/14.native-index-statements/4.rebuild-native-index/","title":"REBUILD INDEX","text":"Danger
LOOKUP
and MATCH
to query the data based on the index. If the index is created before any data insertion, there is no need to rebuild the index.You can use REBUILD INDEX
to rebuild the created tag or edge type index. For details on how to create an index, see CREATE INDEX.
Caution
The speed of rebuilding indexes can be optimized by modifying the rebuild_index_part_rate_limit
and snapshot_batch_size
parameters in the configuration file. In addition, greater parameter values may result in higher memory and network usage, see Storage Service configurations for details.
REBUILD {TAG | EDGE} INDEX [<index_name_list>];\n\n<index_name_list>::=\n [index_name [, index_name] ...]\n
REBUILD
statement, separated by commas. When the index name is not specified, all tag or edge indexes are rebuilt.SHOW {TAG | EDGE} INDEX STATUS
command to check if the index is successfully rebuilt. For details on index status, see SHOW INDEX STATUS.nebula> CREATE TAG IF NOT EXISTS person(name string, age int, gender string, email string);\nnebula> CREATE TAG INDEX IF NOT EXISTS single_person_index ON person(name(10));\n\n# The following example rebuilds an index and returns the job ID.\nnebula> REBUILD TAG INDEX single_person_index;\n+------------+\n| New Job Id |\n+------------+\n| 31 |\n+------------+\n\n# The following example checks the index status.\nnebula> SHOW TAG INDEX STATUS;\n+-----------------------+--------------+\n| Name | Index Status |\n+-----------------------+--------------+\n| \"single_person_index\" | \"FINISHED\" |\n+-----------------------+--------------+\n\n# You can also use \"SHOW JOB <job_id>\" to check if the rebuilding process is complete.\nnebula> SHOW JOB 31;\n+----------------+---------------------+------------+-------------------------+-------------------------+-------------+\n| Job Id(TaskId) | Command(Dest) | Status | Start Time | Stop Time | Error Code |\n+----------------+---------------------+------------+-------------------------+-------------------------+-------------+\n| 31 | \"REBUILD_TAG_INDEX\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:24.000 | \"SUCCEEDED\" |\n| 0 | \"storaged1\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:28.000 | \"SUCCEEDED\" |\n| 1 | \"storaged2\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:28.000 | \"SUCCEEDED\" |\n| 2 | \"storaged0\" | \"FINISHED\" | 2021-07-07T09:04:24.000 | 2021-07-07T09:04:28.000 | \"SUCCEEDED\" |\n| \"Total:3\" | \"Succeeded:3\" | \"Failed:0\" | \"In Progress:0\" | \"\" | \"\" |\n+----------------+---------------------+------------+-------------------------+-------------------------+-------------+\n
NebulaGraph creates a job to rebuild the index. The job ID is displayed in the preceding return message. To check if the rebuilding process is complete, use the SHOW JOB <job_id>
statement. For more information, see SHOW JOB.
SHOW INDEX STATUS
returns the name of the created tag or edge type index and its status of job.
The status of rebuilding indexes includes:
QUEUE
: The job is in a queue.RUNNING
: The job is running.FINISHED
: The job is finished.FAILED
: The job has failed.STOPPED
: The job has stopped.INVALID
: The job is invalid.Note
For details on how to create an index, see CREATE INDEX.
"},{"location":"3.ngql-guide/14.native-index-statements/5.show-native-index-status/#syntax","title":"Syntax","text":"SHOW {TAG | EDGE} INDEX STATUS;\n
"},{"location":"3.ngql-guide/14.native-index-statements/5.show-native-index-status/#example","title":"Example","text":"nebula> SHOW TAG INDEX STATUS;\n+----------------------+--------------+\n| Name | Index Status |\n+----------------------+--------------+\n| \"player_index_0\" | \"FINISHED\" |\n| \"player_index_1\" | \"FINISHED\" |\n+----------------------+--------------+\n
"},{"location":"3.ngql-guide/14.native-index-statements/6.drop-native-index/","title":"DROP INDEX","text":"DROP INDEX
removes an existing index from the current graph space.
Running the DROP INDEX
statement requires some privileges of DROP TAG INDEX
and DROP EDGE INDEX
in the given graph space. Otherwise, NebulaGraph throws an error.
DROP {TAG | EDGE} INDEX [IF EXISTS] <index_name>;\n
IF EXISTS
: Detects whether the index that you want to drop exists. If it exists, it will be dropped.
nebula> DROP TAG INDEX player_index_0;\n
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/","title":"Full-text indexes","text":"Full-text indexes are used to do prefix, wildcard, regexp, and fuzzy search on a string property.
You can use the WHERE
clause to specify the search strings in LOOKUP
statements.
Before using the full-text index, make sure that you have deployed a Elasticsearch cluster and a Listener cluster. For more information, see Deploy Elasticsearch and Deploy Listener.
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#precaution","title":"Precaution","text":"Before using the full-text index, make sure that you know the restrictions.
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#full_text_queries","title":"Full Text Queries","text":"Full-text queries enable you to search for parsed text fields, using a parser with strict syntax to return content based on the query string provided. For details, see Query string query.
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#syntax","title":"Syntax","text":""},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#create_full-text_indexes","title":"Create full-text indexes","text":"CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} (<prop_name> [,<prop_name>]...) [ANALYZER=\"<analyzer_name>\"];\n
<analyzer_name>
is the name of the analyzer. The default value is standard
. To use other analyzers (e.g. IK Analysis), you need to make sure that the corresponding analyzer is installed in Elasticsearch in advance.SHOW FULLTEXT INDEXES;\n
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#rebuild_full-text_indexes","title":"Rebuild full-text indexes","text":"REBUILD FULLTEXT INDEX;\n
Caution
When there is a large amount of data, rebuilding full-text index is slow, you can modify snapshot_send_files=false
in the configuration file of Storage service(nebula-storaged.conf
).
DROP FULLTEXT INDEX <index_name>;\n
"},{"location":"3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index/#use_query_options","title":"Use query options","text":"LOOKUP ON {<tag> | <edge_type>} WHERE ES_QUERY(<index_name>, \"<text>\") YIELD <return_list> [| LIMIT [<offset>,] <number_rows>];\n\n<return_list>\n <prop_name> [AS <prop_alias>] [, <prop_name> [AS <prop_alias>] ...] [, id(vertex) [AS <prop_alias>]] [, score() AS <score_alias>]\n
index_name
: The name of the full-text index.text
: Search conditions. The where can only be followed by the ES_QUERY, and all judgment conditions must be written in the text. For supported syntax, see Query string syntax.score()
: The score calculated by doing N degree expansion for the eligible vertices. The default value is 1.0
. The higher the score, the higher the degree of match. The return value is sorted by default from highest to lowest score. For details, see Search and Scoring in Lucene.// This example creates the graph space.\nnebula> CREATE SPACE IF NOT EXISTS basketballplayer (partition_num=3,replica_factor=1, vid_type=fixed_string(30));\n\n// This example signs in the text service.\nnebula> SIGN IN TEXT SERVICE (192.168.8.100:9200, HTTP);\n\n// This example checks the text service status.\nnebula> SHOW TEXT SEARCH CLIENTS;\n+-----------------+-----------------+------+\n| Type | Host | Port |\n+-----------------+-----------------+------+\n| \"ELASTICSEARCH\" | \"192.168.8.100\" | 9200 |\n+-----------------+-----------------+------+\n\n// This example switches the graph space.\nnebula> USE basketballplayer;\n\n// This example adds the listener to the NebulaGraph cluster.\nnebula> ADD LISTENER ELASTICSEARCH 192.168.8.100:9789;\n\n// This example checks the listener status. When the status is `Online`, the listener is ready.\nnebula> SHOW LISTENER;\n+--------+-----------------+------------------------+-------------+\n| PartId | Type | Host | Host Status |\n+--------+-----------------+------------------------+-------------+\n| 1 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n| 2 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n| 3 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n+--------+-----------------+------------------------+-------------+\n\n// This example creates the tag.\nnebula> CREATE TAG IF NOT EXISTS player(name string, city string);\n\n// This example creates a single-attribute full-text index.\nnebula> CREATE FULLTEXT TAG INDEX fulltext_index_1 ON player(name) ANALYZER=\"standard\";\n\n// This example creates a multi-attribute full-text indexe.\nnebula> CREATE FULLTEXT TAG INDEX fulltext_index_2 ON player(name,city) ANALYZER=\"standard\";\n\n// This example rebuilds the full-text index.\nnebula> REBUILD FULLTEXT INDEX;\n\n// This example shows the full-text index.\nnebula> SHOW FULLTEXT INDEXES;\n+--------------------+-------------+-------------+--------------+------------+\n| Name | Schema Type | Schema Name | Fields | Analyzer |\n+--------------------+-------------+-------------+--------------+------------+\n| \"fulltext_index_1\" | \"Tag\" | \"player\" | \"name\" | \"standard\" |\n| \"fulltext_index_2\" | \"Tag\" | \"player\" | \"name, city\" | \"standard\" |\n+--------------------+-------------+-------------+--------------+------------+\n\n// This example inserts the test data.\nnebula> INSERT VERTEX player(name, city) VALUES \\\n \"Russell Westbrook\": (\"Russell Westbrook\", \"Los Angeles\"), \\\n \"Chris Paul\": (\"Chris Paul\", \"Houston\"),\\\n \"Boris Diaw\": (\"Boris Diaw\", \"Houston\"),\\\n \"David West\": (\"David West\", \"Philadelphia\"),\\\n \"Danny Green\": (\"Danny Green\", \"Philadelphia\"),\\\n \"Tim Duncan\": (\"Tim Duncan\", \"New York\"),\\\n \"James Harden\": (\"James Harden\", \"New York\"),\\\n \"Tony Parker\": (\"Tony Parker\", \"Chicago\"),\\\n \"Aron Baynes\": (\"Aron Baynes\", \"Chicago\"),\\\n \"Ben Simmons\": (\"Ben Simmons\", \"Phoenix\"),\\\n \"Blake Griffin\": (\"Blake Griffin\", \"Phoenix\");\n\n// These examples run test queries.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Chris\") YIELD id(vertex);\n+--------------+\n| id(VERTEX) |\n+--------------+\n| \"Chris Paul\" |\n+--------------+\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Harden\") YIELD 
properties(vertex);\n+----------------------------------------------------------------+\n| properties(VERTEX) |\n+----------------------------------------------------------------+\n| {_vid: \"James Harden\", city: \"New York\", name: \"James Harden\"} |\n+----------------------------------------------------------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"Da*\") YIELD properties(vertex);\n+------------------------------------------------------------------+\n| properties(VERTEX) |\n+------------------------------------------------------------------+\n| {_vid: \"David West\", city: \"Philadelphia\", name: \"David West\"} |\n| {_vid: \"Danny Green\", city: \"Philadelphia\", name: \"Danny Green\"} |\n+------------------------------------------------------------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex);\n+---------------------+\n| id(VERTEX) |\n+---------------------+\n| \"Russell Westbrook\" |\n| \"Boris Diaw\" |\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Blake Griffin\" |\n+---------------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex) | LIMIT 2,3;\n+-----------------+\n| id(VERTEX) |\n+-----------------+\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Blake Griffin\" |\n+-----------------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex) | YIELD count(*);\n+----------+\n| count(*) |\n+----------+\n| 5 |\n+----------+\n\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*\") YIELD id(vertex), score() AS score;\n+---------------------+-------+\n| id(VERTEX) | score |\n+---------------------+-------+\n| \"Russell Westbrook\" | 1.0 |\n| \"Boris Diaw\" | 1.0 |\n| \"Aron Baynes\" | 1.0 |\n| \"Ben Simmons\" | 1.0 |\n| \"Blake Griffin\" | 1.0 |\n+---------------------+-------+\n\n// For documents containing a word `b`, its score will be multiplied by a weighting factor of 4, while for documents containing a word `c`, the default weighting factor of 1 is used.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,\"*b*^4 OR *c*\") YIELD id(vertex), score() AS score;\n+---------------------+-------+\n| id(VERTEX) | score |\n+---------------------+-------+\n| \"Russell Westbrook\" | 4.0 |\n| \"Boris Diaw\" | 4.0 |\n| \"Aron Baynes\" | 4.0 |\n| \"Ben Simmons\" | 4.0 |\n| \"Blake Griffin\" | 4.0 |\n| \"Chris Paul\" | 1.0 |\n| \"Tim Duncan\" | 1.0 |\n+---------------------+-------+\n\n// When using a multi-attribute full-text index query, the conditions are matched within all properties of the index.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,\"*h*\") YIELD properties(vertex);\n+------------------------------------------------------------------+\n| properties(VERTEX) |\n+------------------------------------------------------------------+\n| {_vid: \"Chris Paul\", city: \"Houston\", name: \"Chris Paul\"} |\n| {_vid: \"Boris Diaw\", city: \"Houston\", name: \"Boris Diaw\"} |\n| {_vid: \"David West\", city: \"Philadelphia\", name: \"David West\"} |\n| {_vid: \"James Harden\", city: \"New York\", name: \"James Harden\"} |\n| {_vid: \"Tony Parker\", city: \"Chicago\", name: \"Tony Parker\"} |\n| {_vid: \"Aron Baynes\", city: \"Chicago\", name: \"Aron Baynes\"} |\n| {_vid: \"Ben Simmons\", city: \"Phoenix\", name: \"Ben Simmons\"} |\n| {_vid: \"Blake Griffin\", city: \"Phoenix\", name: \"Blake Griffin\"} |\n| {_vid: \"Danny Green\", city: \"Philadelphia\", name: \"Danny Green\"} 
|\n+------------------------------------------------------------------+\n\n// When using multi-attribute full-text index queries, you can specify different text for different properties for the query.\nnebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,\"name:*b* AND city:Houston\") YIELD properties(vertex);\n+-----------------------------------------------------------+\n| properties(VERTEX) |\n+-----------------------------------------------------------+\n| {_vid: \"Boris Diaw\", city: \"Houston\", name: \"Boris Diaw\"} |\n+-----------------------------------------------------------+\n\n// Delete single-attribute full-text index.\nnebula> DROP FULLTEXT INDEX fulltext_index_1;\n
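To clean up after testing, you can also drop the remaining index and sign out the text service; a minimal sketch:
// Delete the multi-attribute full-text index.\nnebula> DROP FULLTEXT INDEX fulltext_index_2;\n\n// Sign out all text services.\nnebula> SIGN OUT TEXT SERVICE;\n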
"},{"location":"3.ngql-guide/17.query-tuning-statements/1.explain-and-profile/","title":"EXPLAIN and PROFILE","text":"EXPLAIN
helps output the execution plan of an nGQL statement without executing the statement.
PROFILE
executes the statement, then outputs the execution plan as well as the execution profile. You can optimize the queries for better performance according to the execution plan and profile.
The execution plan is determined by the execution planner in the NebulaGraph query engine.
The execution planner processes the parsed nGQL statements into actions
. An action
is the smallest unit that can be executed. A typical action
 fetches all neighbors of a given vertex, gets the properties of an edge, or filters vertices or edges based on the given conditions. Each action
is assigned to an operator
that performs the action.
For example, a SHOW TAGS
statement is processed into two actions
and assigned to a Start operator
and a ShowTags operator
, while a more complex GO
statement may be processed into more than 10 actions
 and assigned to as many operators.
EXPLAIN
EXPLAIN [format= {\"row\" | \"dot\" | \"tck\"}] <your_nGQL_statement>;\n
PROFILE
PROFILE [format= {\"row\" | \"dot\" | \"tck\"}] <your_nGQL_statement>;\n
The output of an EXPLAIN
or a PROFILE
 statement has three formats: the default row
format, the dot
format, and the tck
format. You can use the format
option to modify the output format. Omitting the format
option indicates using the default row
format.
row
format","text":"The row
format outputs the return message in a table as follows.
EXPLAIN
nebula> EXPLAIN format=\"row\" SHOW TAGS;\nExecution succeeded (time spent 327/892 us)\n\nExecution Plan\n\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n| id | name | dependencies | profiling data | operator info |\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n| 1 | ShowTags | 0 | | outputVar: [{\"colNames\":[],\"name\":\"__ShowTags_1\",\"type\":\"DATASET\"}] |\n| | | | | inputVar: |\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n| 0 | Start | | | outputVar: [{\"colNames\":[],\"name\":\"__Start_0\",\"type\":\"DATASET\"}] |\n-----+----------+--------------+----------------+----------------------------------------------------------------------\n
PROFILE
nebula> PROFILE format=\"row\" SHOW TAGS;\n+--------+\n| Name |\n+--------+\n| player |\n+--------+\n| team |\n+--------+\nGot 2 rows (time spent 2038/2728 us)\n\nExecution Plan\n\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n| id | name | dependencies | profiling data | operator info |\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n| 1 | ShowTags | 0 | ver: 0, rows: 1, execTime: 42us, totalTime: 1177us | outputVar: [{\"colNames\":[],\"name\":\"__ShowTags_1\",\"type\":\"DATASET\"}] |\n| | | | | inputVar: |\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n| 0 | Start | | ver: 0, rows: 0, execTime: 1us, totalTime: 57us | outputVar: [{\"colNames\":[],\"name\":\"__Start_0\",\"type\":\"DATASET\"}] |\n-----+----------+--------------+----------------------------------------------------+----------------------------------------------------------------------\n
The descriptions are as follows.
Parameter Descriptionid
The ID of the operator
. name
The name of the operator
. dependencies
The ID of the operator
that the current operator
depends on. profiling data
The content of the execution profile. ver
is the version of the operator
. rows
shows the number of rows to be output by the operator
. execTime
 shows the execution time of the action
. totalTime
is the sum of the execution time, the system scheduling time, and the queueing time. operator info
The detailed information of the operator
."},{"location":"3.ngql-guide/17.query-tuning-statements/1.explain-and-profile/#the_dot_format","title":"The dot
format","text":"You can use the format=\"dot\"
option to output the return message in the dot
language, and then use Graphviz to generate a graph of the plan.
Note
Graphviz is open-source graph visualization software. It provides an online tool for previewing DOT language files and exporting them to other formats such as SVG or JSON. For more information, see Graphviz Online.
nebula> EXPLAIN format=\"dot\" SHOW TAGS;\nExecution succeeded (time spent 161/665 us)\nExecution Plan\n--------------------------------------------------------------------------------------------------------------------------------------------- -------------\n plan\n--------------------------------------------------------------------------------------------------------------------------------------------- -------------\n digraph exec_plan {\n rankdir=LR;\n \"ShowTags_0\"[label=\"ShowTags_0|outputVar: \\[\\{\\\"colNames\\\":\\[\\],\\\"name\\\":\\\"__ShowTags_0\\\",\\\"type\\\":\\\"DATASET\\\"\\}\\]\\l|inputVar:\\l\", shape=Mrecord];\n \"Start_2\"->\"ShowTags_0\";\n \"Start_2\"[label=\"Start_2|outputVar: \\[\\{\\\"colNames\\\":\\[\\],\\\"name\\\":\\\"__Start_2\\\",\\\"type\\\":\\\"DATASET\\\"\\}\\]\\l|inputVar: \\l\", shape=Mrecord];\n }\n--------------------------------------------------------------------------------------------------------------------------------------------- -------------\n
The Graphviz graph transformed from the above DOT statement is as follows.
"},{"location":"3.ngql-guide/17.query-tuning-statements/1.explain-and-profile/#the_tck_format","title":"Thetck
format","text":"The tck format is similar to a table, but without borders and dividing lines between rows. You can use the results as test cases for unit testing. For information on tck format test cases, see TCK cases.
EXPLAIN
nebula> EXPLAIN format=\"tck\" FETCH PROP ON player \"player_1\",\"player_2\",\"player_3\" YIELD properties(vertex).name as name, properties(vertex).age as age;\nExecution succeeded (time spent 261\u00b5s/613.718\u00b5s)\nExecution Plan (optimize time 28 us)\n| id | name | dependencies | profiling data | operator info |\n| 2 | Project | 1 | | |\n| 1 | GetVertices | 0 | | |\n| 0 | Start | | | |\n\nWed, 22 Mar 2023 23:15:52 CST\n
PROFILE
nebula> PROFILE format=\"tck\" FETCH PROP ON player \"player_1\",\"player_2\",\"player_3\" YIELD properties(vertex).name as name, properties(vertex).age as age;\n| name | age |\n| \"Piter Park\" | 24 |\n| \"aaa\" | 24 |\n| \"ccc\" | 24 |\nGot 3 rows (time spent 1.474ms/2.19677ms)\nExecution Plan (optimize time 41 us)\n| id | name | dependencies | profiling data | operator info |\n| 2 | Project | 1 | {\"rows\":3,\"version\":0} | |\n| 1 | GetVertices | 0 | {\"resp[0]\":{\"exec\":\"232(us)\",\"host\":\"127.0.0.1:9779\",\"total\":\"758(us)\"},\"rows\":3,\"total_rpc\":\"875(us)\",\"version\":0} | |\n| 0 | Start | | {\"rows\":0,\"version\":0} | |\nWed, 22 Mar 2023 23:16:13 CST\n
The KILL SESSION
 command terminates running sessions.
Note
root
user can terminate sessions.KILL SESSION
command, all Graph services synchronize the latest session information after 2* session_reclaim_interval_secs
seconds (120
seconds by default).You can run the KILL SESSION
command to terminate one or multiple sessions. The syntax is as follows:
To terminate one session
KILL {SESSION|SESSIONS} <SessionId>\n
{SESSION|SESSIONS}
: SESSION
or SESSIONS
, both are supported. <SessionId>
: Specifies the ID of one session. You can run the SHOW SESSIONS command to view the IDs of sessions.To terminate multiple sessions
SHOW SESSIONS \n| YIELD $-.SessionId AS sid [WHERE <filter_clause>]\n| KILL {SESSION|SESSIONS} $-.sid\n
Note
The KILL SESSION
command supports the pipeline operation, combining the SHOW SESSIONS
command with the KILL SESSION
command to terminate multiple sessions.
[WHERE <filter_clause>]
\uff1aWHERE
clause is used to filter sessions. <filter_expression>
specifies a session filtering expression, for example, WHERE $-.CreateTime < datetime(\"2022-12-14T18:00:00\")
. If the WHERE
clause is not specified, all sessions are terminated.WHERE
clause include: SessionId
, UserName
, SpaceName
, CreateTime
, UpdateTime
, GraphAddr
, Timezone
, and ClientIp
. You can run the SHOW SESSIONS command to view descriptions of these conditions.{SESSION|SESSIONS}
: SESSION
or SESSIONS
, both are supported.Caution
Please use filtering conditions with caution to avoid deleting sessions by mistake.
To terminate one session
nebula> KILL SESSION 1672887983842984 \n
To terminate multiple sessions
Terminate all sessions whose creation time is less than 2023-01-05T18:00:00
.
nebula> SHOW SESSIONS | YIELD $-.SessionId AS sid WHERE $-.CreateTime < datetime(\"2023-01-05T18:00:00\") | KILL SESSIONS $-.sid\n
Terminates the two sessions with the earliest creation times.
nebula> SHOW SESSIONS | YIELD $-.SessionId AS sid, $-.CreateTime as CreateTime | ORDER BY $-.CreateTime ASC | LIMIT 2 | KILL SESSIONS $-.sid\n
Terminates all sessions created by the username session_user1
.
nebula> SHOW SESSIONS | YIELD $-.SessionId as sid WHERE $-.UserName == \"session_user1\" | KILL SESSIONS $-.sid\n
Terminate all sessions.
nebula> SHOW SESSIONS | YIELD $-.SessionId as sid | KILL SESSION $-.sid\n\n// Or\nnebula> SHOW SESSIONS | KILL SESSIONS $-.SessionId\n
Caution
When you terminate all sessions, the current session is terminated. Please use it with caution.
KILL QUERY
can terminate the query being executed, and is often used to terminate slow queries.
Note
Users with the God role can kill any query. Other roles can only kill their own queries.
"},{"location":"3.ngql-guide/17.query-tuning-statements/6.kill-query/#syntax","title":"Syntax","text":"KILL QUERY (session=<session_id>, plan=<plan_id>);\n
session_id
: The ID of the session.plan_id
: The ID of the execution plan.The ID of the session and the ID of the execution plan can uniquely determine a query. Both can be obtained through the SHOW QUERIES statement.
"},{"location":"3.ngql-guide/17.query-tuning-statements/6.kill-query/#examples","title":"Examples","text":"This example executes KILL QUERY
in one session to terminate the query in another session.
nebula> KILL QUERY(SESSION=1625553545984255,PLAN=163);\n
The query will be terminated and the following information will be returned.
[ERROR (-1005)]: ExecutionPlanId[1001] does not exist in current Session.\n
"},{"location":"3.ngql-guide/3.data-types/1.numeric/","title":"Numeric types","text":"nGQL supports both integer and floating-point number.
"},{"location":"3.ngql-guide/3.data-types/1.numeric/#integer","title":"Integer","text":"Signed 64-bit integer (INT64), 32-bit integer (INT32), 16-bit integer (INT16), and 8-bit integer (INT8) are supported.
Type Declared keywords Range INT64INT64
orINT
-9,223,372,036,854,775,808 ~ 9,223,372,036,854,775,807 INT32 INT32
-2,147,483,648 ~ 2,147,483,647 INT16 INT16
-32,768 ~ 32,767 INT8 INT8
-128 ~ 127"},{"location":"3.ngql-guide/3.data-types/1.numeric/#floating-point_number","title":"Floating-point number","text":"Both single-precision floating-point format (FLOAT) and double-precision floating-point format (DOUBLE) are supported.
Type Declared keywords Range Precision FLOATFLOAT
3.4E +/- 38 6~7 bits DOUBLE DOUBLE
1.7E +/- 308 15~16 bits Scientific notation is also supported, such as 1e2
, 1.1e2
, .3e4
, 1.e4
, and -1234E-10
.
Note
The data type of DECIMAL in MySQL is not supported.
"},{"location":"3.ngql-guide/3.data-types/1.numeric/#reading_and_writing_of_data_values","title":"Reading and writing of data values","text":"When writing and reading different types of data, nGQL complies with the following rules:
Data type Set as VID Set as property Resulted data type INT64 Supported Supported INT64 INT32 Not supported Supported INT64 INT16 Not supported Supported INT64 INT8 Not supported Supported INT64 FLOAT Not supported Supported DOUBLE DOUBLE Not supported Supported DOUBLEFor example, nGQL does not support setting VID as INT8, but supports setting a certain property type of TAG or Edge type as INT8. When using the nGQL statement to read the property of INT8, the resulted type is INT64.
Multiple formats are supported:
123456
.0x1e240
.0361100
.However, NebulaGraph will parse the written non-decimal value into a decimal value and save it. The value read is decimal.
For example, the type of the property score
is INT
. The value of 0xb
is assigned to it through the INSERT statement. If querying the property value with statements such as FETCH, you will get the result 11
, which is the decimal result of the hexadecimal 0xb
.
Geography is a data type composed of latitude and longitude that represents geospatial information. NebulaGraph currently supports Point, LineString, and Polygon in Simple Features and some functions in SQL-MM 3, such as part of the core geo parsing, construction, formatting, conversion, predicates, and dimensions.
"},{"location":"3.ngql-guide/3.data-types/10.geography/#type_description","title":"Type description","text":"A point is the basic data type of geography, which is determined by a latitude and a longitude. For example, \"POINT(3 8)\"
means that the longitude is 3\u00b0
and the latitude is 8\u00b0
. Multiple points can form a linestring or a polygon.
Note
You cannot directly insert geographic data of the following types, such as INSERT VERTEX any_shape(geo) VALUES \"1\":(\"POINT(1 1)\")
. Instead, you need to use a geography function to specify the data type before inserting, such as INSERT VERTEX any_shape(geo) VALUES \"1\":(ST_GeogFromText(\"POINT(1 1)\"));
.
\"POINT(3 8)\"
Specifies the data type as a point. LineString \"LINESTRING(3 8, 4.7 73.23)\"
Specifies the data type as a linestring. Polygon \"POLYGON((0 1, 1 2, 2 3, 0 1))\"
Specifies the data type as a polygon."},{"location":"3.ngql-guide/3.data-types/10.geography/#examples","title":"Examples","text":"//Create a Tag to allow storing any geography data type.\nnebula> CREATE TAG IF NOT EXISTS any_shape(geo geography);\n\n//Create a Tag to allow storing a point only.\nnebula> CREATE TAG IF NOT EXISTS only_point(geo geography(point));\n\n//Create a Tag to allow storing a linestring only.\nnebula> CREATE TAG IF NOT EXISTS only_linestring(geo geography(linestring));\n\n//Create a Tag to allow storing a polygon only.\nnebula> CREATE TAG IF NOT EXISTS only_polygon(geo geography(polygon));\n\n//Create an Edge type to allow storing any geography data type.\nnebula> CREATE EDGE IF NOT EXISTS any_shape_edge(geo geography);\n\n//Create a vertex to store the geography of a polygon.\nnebula> INSERT VERTEX any_shape(geo) VALUES \"103\":(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\"));\n\n//Create an edge to store the geography of a polygon.\nnebula> INSERT EDGE any_shape_edge(geo) VALUES \"201\"->\"302\":(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\"));\n\n//Query the geography of Vertex 103.\nnebula> FETCH PROP ON any_shape \"103\" YIELD ST_ASText(any_shape.geo);\n+---------------------------------+\n| ST_ASText(any_shape.geo) |\n+---------------------------------+\n| \"POLYGON((0 1, 1 2, 2 3, 0 1))\" |\n+---------------------------------+\n\n//Query the geography of the edge which traverses from Vertex 201 to Vertex 302.\nnebula> FETCH PROP ON any_shape_edge \"201\"->\"302\" YIELD ST_ASText(any_shape_edge.geo);\n+---------------------------------+\n| ST_ASText(any_shape_edge.geo) |\n+---------------------------------+\n| \"POLYGON((0 1, 1 2, 2 3, 0 1))\" |\n+---------------------------------+\n\n//Create an index for the geography of the Tag any_shape and run LOOKUP.\nnebula> CREATE TAG INDEX IF NOT EXISTS any_shape_geo_index ON any_shape(geo);\nnebula> REBUILD TAG INDEX any_shape_geo_index;\nnebula> LOOKUP ON any_shape YIELD ST_ASText(any_shape.geo);\n+---------------------------------+\n| ST_ASText(any_shape.geo) |\n+---------------------------------+\n| \"POLYGON((0 1, 1 2, 2 3, 0 1))\" |\n+---------------------------------+\n
When creating an index for geography properties, you can specify the parameters for the index.
Parameter Default value Descriptions2_max_level
30
The maximum level of S2 cell used in the covering. Allowed values: 1
~30
. Setting it to less than the default means that NebulaGraph will be forced to generate coverings using larger cells. s2_max_cells
8
The maximum number of S2 cells used in the covering. Provides a limit on how much work is done exploring the possible coverings. Allowed values: 1
~30
. You may want to use higher values for odd-shaped regions such as skinny rectangles. Note
Specifying the above two parameters does not affect the Point type of property. The s2_max_level
value of the Point type is forced to be 30
.
nebula> CREATE TAG INDEX IF NOT EXISTS any_shape_geo_index ON any_shape(geo) with (s2_max_level=30, s2_max_cells=8);\n
For more index information, see Index overview.
"},{"location":"3.ngql-guide/3.data-types/2.boolean/","title":"Boolean","text":"A boolean data type is declared with the bool
keyword and can only take the values true
or false
.
nGQL supports using boolean in the following ways:
WHERE
clause.Fixed-length strings and variable-length strings are supported.
"},{"location":"3.ngql-guide/3.data-types/3.string/#declaration_and_literal_representation","title":"Declaration and literal representation","text":"The string type is declared with the keywords of:
STRING
: Variable-length strings.FIXED_STRING(<length>)
: Fixed-length strings. <length>
is the length of the string, such as FIXED_STRING(32)
.A string type is used to store a sequence of characters (text). The literal constant is a sequence of characters of any length surrounded by double or single quotes. For example, \"Hello, Cooper\"
or 'Hello, Cooper'
.
Nebula\u00a0Graph supports using string types in the following ways:
For example:
nebula> CREATE TAG IF NOT EXISTS t1 (p1 FIXED_STRING(10)); \n
nebula> CREATE TAG IF NOT EXISTS t2 (p2 STRING); \n
When the fixed-length string you try to write exceeds the length limit:
In strings, the backslash (\\
) serves as an escape character used to denote special characters.
For example, to include a double quote (\"
) within a string, you cannot directly write \"Hello \"world\"\"
as it leads to a syntax error. Instead, use the backslash (\\
) to escape the double quote, such as \"Hello \\\"world\\\"\"
.
nebula> RETURN \"Hello \\\"world\\\"\"\n+-----------------+\n| \"Hello \"world\"\" |\n+-----------------+\n| \"Hello \"world\"\" |\n+-----------------+\n
The backslash itself needs to be escaped as it's a special character. For example, to include a backslash in a string, you need to write \"Hello \\\\ world\"
.
nebula> RETURN \"Hello \\\\ world\"\n+-----------------+\n| \"Hello \\ world\" |\n+-----------------+\n| \"Hello \\ world\" |\n+-----------------+\n
For more examples of escape characters, see Escape character examples.
"},{"location":"3.ngql-guide/3.data-types/3.string/#opencypher_compatibility","title":"OpenCypher compatibility","text":"There are some tiny differences between openCypher and Cypher, as well as nGQL. The following is what openCypher requires. Single quotes cannot be converted to double quotes.
# File: Literals.feature\nFeature: Literals\n\nBackground:\n Given any graph\n Scenario: Return a single-quoted string\n When executing query:\n \"\"\"\n RETURN '' AS literal\n \"\"\"\n Then the result should be, in any order:\n | literal |\n | '' | # Note: it should return single-quotes as openCypher required.\n And no side effects\n
While Cypher accepts both single quotes and double quotes as the return results. nGQL follows the Cypher way.
nebula > YIELD '' AS quote1, \"\" AS quote2, \"'\" AS quote3, '\"' AS quote4\n+--------+--------+--------+--------+\n| quote1 | quote2 | quote3 | quote4 |\n+--------+--------+--------+--------+\n| \"\" | \"\" | \"'\" | \"\"\" |\n+--------+--------+--------+--------+\n
"},{"location":"3.ngql-guide/3.data-types/4.date-and-time/","title":"Date and time types","text":"This topic will describe the DATE
, TIME
, DATETIME
, TIMESTAMP
, and DURATION
types.
While inserting time-type property values with DATE
, TIME
, and DATETIME
, NebulaGraph transforms them to a UTC time according to the timezone specified with the timezone_name
parameter in the configuration files.
Note
To change the timezone, modify the timezone_name
value in the configuration files of all NebulaGraph services.
date()
, time()
, and datetime()
can convert a time-type property with a specified timezone. For example, datetime(\"2017-03-04 22:30:40.003000+08:00\")
or datetime(\"2017-03-04T22:30:40.003000[Asia/Shanghai]\")
.date()
, time()
, datetime()
, and timestamp()
all accept empty parameters to return the current date, time, and datetime.date()
, time()
, and datetime()
all accept the property name to return a specific property value of itself. For example, date().month
returns the current month, while time(\"02:59:40\").minute
returns the minutes of the importing time.duration()
to calculate the offset of the moment. Addition and subtraction of date()
and date()
, timestamp()
and timestamp()
are also supported.In nGQL:
localdatetime()
is not supported.YYYY-MM-DDThh:mm:ss
and YYYY-MM-DD hh:mm:ss
.time(\"1:1:1\")
.The DATE
type is used for values with a date part but no time part. Nebula\u00a0Graph retrieves and displays DATE
values in the YYYY-MM-DD
format. The supported range is -32768-01-01
to 32767-12-31
.
The properties of date()
include year
, month
, and day
. date()
supports the input of YYYYY
, YYYYY-MM
or YYYYY-MM-DD
, and defaults to 01
for an untyped month or day.
nebula> RETURN DATE({year:-123, month:12, day:3});\n+------------------------------------+\n| date({year:-(123),month:12,day:3}) |\n+------------------------------------+\n| -123-12-03 |\n+------------------------------------+\n\nnebula> RETURN DATE(\"23333\");\n+---------------+\n| date(\"23333\") |\n+---------------+\n| 23333-01-01 |\n+---------------+\n\nnebula> RETURN DATE(\"2023-12-12\") - DATE(\"2023-12-11\");\n+-----------------------------------------+\n| (date(\"2023-12-12\")-date(\"2023-12-11\")) |\n+-----------------------------------------+\n| 1 |\n+-----------------------------------------+\n
"},{"location":"3.ngql-guide/3.data-types/4.date-and-time/#time","title":"TIME","text":"The TIME
type is used for values with a time part but no date part. Nebula\u00a0Graph retrieves and displays TIME
values in hh:mm:ss.msmsmsususus
format. The supported range is 00:00:00.000000
to 23:59:59.999999
.
The properties of time()
include hour
, minute
, and second
.
The DATETIME
type is used for values that contain both date and time parts. Nebula\u00a0Graph retrieves and displays DATETIME
values in YYYY-MM-DDThh:mm:ss.msmsmsususus
format. The supported range is -32768-01-01T00:00:00.000000
to 32767-12-31T23:59:59.999999
.
datetime()
include year
, month
, day
, hour
, minute
, and second
.datetime()
can convert TIMESTAMP
to DATETIME
. The value range of TIMESTAMP
is 0~9223372036
.datetime()
supports an int
argument. The int
argument specifies a timestamp.# To get the current date and time.\nnebula> RETURN datetime();\n+----------------------------+\n| datetime() |\n+----------------------------+\n| 2022-08-29T06:37:08.933000 |\n+----------------------------+\n\n# To get the current hour.\nnebula> RETURN datetime().hour;\n+-----------------+\n| datetime().hour |\n+-----------------+\n| 6 |\n+-----------------+\n\n# To get date time from a given timestamp.\nnebula> RETURN datetime(timestamp(1625469277));\n+---------------------------------+\n| datetime(timestamp(1625469277)) |\n+---------------------------------+\n| 2021-07-05T07:14:37.000000 |\n+---------------------------------+\n\nnebula> RETURN datetime(1625469277);\n+----------------------------+\n| datetime(1625469277) |\n+----------------------------+\n| 2021-07-05T07:14:37.000000 |\n+----------------------------+\n
"},{"location":"3.ngql-guide/3.data-types/4.date-and-time/#timestamp","title":"TIMESTAMP","text":"The TIMESTAMP
data type is used for values that contain both date and time parts. It has a range of 1970-01-01T00:00:01
UTC to 2262-04-11T23:47:16
UTC.
TIMESTAMP
has the following features:
1615974839
, which means 2021-03-17T17:53:59
.TIMESTAMP
querying methods: timestamp and timestamp()
function.TIMESTAMP
inserting methods: timestamp, timestamp()
function, and now()
function.timestamp()
function accepts empty arguments to get the current timestamp. It can pass an integer arguments to identify the integer as a timestamp and the range of passed integer is: 0~9223372036
\u3002timestamp()
function can convert DATETIME
to TIMESTAMP
, and the data type of DATETIME
should be a string
. # To get the current timestamp.\nnebula> RETURN timestamp();\n+-------------+\n| timestamp() |\n+-------------+\n| 1625469277 |\n+-------------+\n\n# To get a timestamp from given date and time.\nnebula> RETURN timestamp(\"2022-01-05T06:18:43\");\n+----------------------------------+\n| timestamp(\"2022-01-05T06:18:43\") |\n+----------------------------------+\n| 1641363523 |\n+----------------------------------+\n\n# To get a timestamp using datetime().\nnebula> RETURN timestamp(datetime(\"2022-08-29T07:53:10.939000\"));\n+---------------------------------------------------+\n| timestamp(datetime(\"2022-08-29T07:53:10.939000\")) |\n+---------------------------------------------------+\n| 1661759590 |\n+---------------------------------------------------+ \n
Note
The date and time format string passed into timestamp()
cannot include any millisecond and microsecond, but the date and time format string passed into timestamp(datetime())
can include a millisecond and a microsecond.
The DURATION
data type is used to indicate a period of time. Map data that are freely combined by years
, months
, days
, hours
, minutes
, and seconds
indicates the DURATION
.
DURATION
has the following features:
DURATION
is not supported.DURATION
can be used to calculate the specified time.Create a tag named date1
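A minimal sketch of duration arithmetic, mirroring the date calculation shown later in this topic:
nebula> RETURN date(\"1984-10-11\") + duration({years: 1, days: 2}) AS d;\n
The expected result is 1985-10-13.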
with three properties: DATE
, TIME
, and DATETIME
.
nebula> CREATE TAG IF NOT EXISTS date1(p1 date, p2 time, p3 datetime);\n
Insert a vertex named test1
.
nebula> INSERT VERTEX date1(p1, p2, p3) VALUES \"test1\":(date(\"2021-03-17\"), time(\"17:53:59\"), datetime(\"2017-03-04T22:30:40.003000[Asia/Shanghai]\"));\n
Query whether the value of property p1
on the test1
tag is 2021-03-17
.
nebula> MATCH (v:date1) RETURN v.date1.p1 == date(\"2021-03-17\");\n+----------------------------------+\n| (v.date1.p1==date(\"2021-03-17\")) |\n+----------------------------------+\n| true |\n+----------------------------------+\n
Return the content of the property p1
on test1
.
nebula> CREATE TAG INDEX IF NOT EXISTS date1_index ON date1(p1);\nnebula> REBUILD TAG INDEX date1_index;\nnebula> MATCH (v:date1) RETURN v.date1.p1;\n+------------------+\n| v.date1.p1.month |\n+------------------+\n| 3 |\n+------------------+\n
Search for vertices with p3
property values less than 2023-01-01T00:00:00.000000
, and return the p3
values.
nebula> MATCH (v:date1) \\\nWHERE v.date1.p3 < datetime(\"2023-01-01T00:00:00.000000\") \\\nRETURN v.date1.p3;\n+----------------------------+\n| v.date1.p3 |\n+----------------------------+\n| 2017-03-04T14:30:40.003000 |\n+----------------------------+\n
Create a tag named school
with the property of TIMESTAMP
.
nebula> CREATE TAG IF NOT EXISTS school(name string , found_time timestamp);\n
Insert a vertex named DUT
with a found-time timestamp of \"1988-03-01T08:00:00\"
.
# Insert as a timestamp. The corresponding timestamp of 1988-03-01T08:00:00 is 573177600, or 573206400 UTC.\nnebula> INSERT VERTEX school(name, found_time) VALUES \"DUT\":(\"DUT\", 573206400);\n\n# Insert in the form of date and time.\nnebula> INSERT VERTEX school(name, found_time) VALUES \"DUT\":(\"DUT\", timestamp(\"1988-03-01T08:00:00\"));\n
Insert a vertex named dut
and store time with now()
or timestamp()
functions.
# Use now() function to store time\nnebula> INSERT VERTEX school(name, found_time) VALUES \"dut\":(\"dut\", now());\n\n# Use timestamp() function to store time\nnebula> INSERT VERTEX school(name, found_time) VALUES \"dut\":(\"dut\", timestamp());\n
You can also use WITH
statement to set a specific date and time, or to perform calculations. For example:
nebula> WITH time({hour: 12, minute: 31, second: 14, millisecond:111, microsecond: 222}) AS d RETURN d;\n+-----------------+\n| d |\n+-----------------+\n| 12:31:14.111222 |\n+-----------------+\n\nnebula> WITH date({year: 1984, month: 10, day: 11}) AS x RETURN x + 1;\n+------------+\n| (x+1) |\n+------------+\n| 1984-10-12 |\n+------------+\n\nnebula> WITH date('1984-10-11') as x, duration({years: 12, days: 14, hours: 99, minutes: 12}) as d \\\n RETURN x + d AS sum, x - d AS diff;\n+------------+------------+\n| sum | diff |\n+------------+------------+\n| 1996-10-29 | 1972-09-23 |\n+------------+------------+\n
"},{"location":"3.ngql-guide/3.data-types/5.null/","title":"NULL","text":"You can set the properties for vertices or edges to NULL
. Also, you can set the NOT NULL
constraint to make sure that the property values are NOT NULL
. If not specified, the property is set to NULL
by default.
Here is the truth table for AND
, OR
, XOR
, and NOT
.
The comparisons and operations about NULL are different from openCypher. There may be changes later.
"},{"location":"3.ngql-guide/3.data-types/5.null/#comparisons_with_null","title":"Comparisons with NULL","text":"The comparison operations with NULL are incompatible with openCypher.
"},{"location":"3.ngql-guide/3.data-types/5.null/#operations_and_return_with_null","title":"Operations and RETURN with NULL","text":"The NULL operations and RETURN with NULL are incompatible with openCypher.
"},{"location":"3.ngql-guide/3.data-types/5.null/#examples","title":"Examples","text":""},{"location":"3.ngql-guide/3.data-types/5.null/#use_not_null","title":"Use NOT NULL","text":"Create a tag named player
. Specify the property name
as NOT NULL
.
nebula> CREATE TAG IF NOT EXISTS player(name string NOT NULL, age int);\n
Use SHOW
to create tag statements. The property name
is NOT NULL
. The property age
is NULL
by default.
nebula> SHOW CREATE TAG player;\n+-----------+-----------------------------------+\n| Tag | Create Tag |\n+-----------+-----------------------------------+\n| \"student\" | \"CREATE TAG `player` ( |\n| | `name` string NOT NULL, |\n| | `age` int64 NULL |\n| | ) ttl_duration = 0, ttl_col = \"\"\" |\n+-----------+-----------------------------------+\n
Insert the vertex Kobe
. The property age
can be NULL
.
nebula> INSERT VERTEX player(name, age) VALUES \"Kobe\":(\"Kobe\",null);\n
"},{"location":"3.ngql-guide/3.data-types/5.null/#use_not_null_and_set_the_default","title":"Use NOT NULL and set the default","text":"Create a tag named player
. Specify the property age
as NOT NULL
. The default value is 18
.
nebula> CREATE TAG IF NOT EXISTS player(name string, age int NOT NULL DEFAULT 18);\n
Insert the vertex Kobe
. Specify the property name
only.
nebula> INSERT VERTEX player(name) VALUES \"Kobe\":(\"Kobe\");\n
Query the vertex Kobe
. The property age
is 18
by default.
nebula> FETCH PROP ON player \"Kobe\" YIELD properties(vertex);\n+--------------------------+\n| properties(VERTEX) |\n+--------------------------+\n| {age: 18, name: \"Kobe\"} |\n+--------------------------+\n
"},{"location":"3.ngql-guide/3.data-types/6.list/","title":"Lists","text":"The list is a composite data type. A list is a sequence of values. Individual elements in a list can be accessed by their positions.
A list starts with a left square bracket [
and ends with a right square bracket ]
. A list contains zero, one, or more expressions. List elements are separated from each other with commas (,
). Whitespace around elements is ignored in the list, thus line breaks, tab stops, and blanks can be used for formatting.
A composite data type (i.e. set, map, and list) CANNOT be stored as properties of vertices or edges.
"},{"location":"3.ngql-guide/3.data-types/6.list/#list_operations","title":"List operations","text":"You can use the preset list function to operate the list, or use the index to filter the elements in the list.
"},{"location":"3.ngql-guide/3.data-types/6.list/#index_syntax","title":"Index syntax","text":"[M]\n[M..N]\n[M..]\n[..N]\n
The index of nGQL supports queries from front to back, starting from 0. 0 means the first element, 1 means the second element, and so on. It also supports queries from back to front, starting from -1. -1 means the last element, -2 means the penultimate element, and so on.
greater or equal to M but smaller than N
. Return empty when N
is 0.greater or equal to M
.smaller than N
. Return empty when N
is 0.Note
M
\u2265N
.M
is null, return BAD_TYPE
. When conducting a range query, if M
or N
is null, return null
.# The following query returns the list [1,2,3].\nnebula> RETURN list[1, 2, 3] AS a;\n+-----------+\n| a |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query returns the element whose index is 3 in the list [1,2,3,4,5]. In a list, the index starts from 0, and thus the return element is 4.\nnebula> RETURN range(1,5)[3];\n+---------------+\n| range(1,5)[3] |\n+---------------+\n| 4 |\n+---------------+\n\n# The following query returns the element whose index is -2 in the list [1,2,3,4,5]. The index of the last element in a list is -1, and thus the return element is 4.\nnebula> RETURN range(1,5)[-2];\n+------------------+\n| range(1,5)[-(2)] |\n+------------------+\n| 4 |\n+------------------+\n\n# The following query returns the elements whose indexes are from 0 to 3 (not including 3) in the list [1,2,3,4,5].\nnebula> RETURN range(1,5)[0..3];\n+------------------+\n| range(1,5)[0..3] |\n+------------------+\n| [1, 2, 3] |\n+------------------+\n\n# The following query returns the elements whose indexes are greater than 2 in the list [1,2,3,4,5].\nnebula> RETURN range(1,5)[3..] AS a;\n+--------+\n| a |\n+--------+\n| [4, 5] |\n+--------+\n\n# The following query returns the elements whose indexes are smaller than 3.\nnebula> WITH list[1, 2, 3, 4, 5] AS a \\\n RETURN a[..3] AS r;\n+-----------+\n| r |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query filters the elements whose indexes are greater than 2 in the list [1,2,3,4,5], calculate them respectively, and returns them.\nnebula> RETURN [n IN range(1,5) WHERE n > 2 | n + 10] AS a;\n+--------------+\n| a |\n+--------------+\n| [13, 14, 15] |\n+--------------+\n\n# The following query returns the elements from the first to the penultimate (inclusive) in the list [1, 2, 3].\nnebula> YIELD list[1, 2, 3][0..-1] AS a;\n+--------+\n| a |\n+--------+\n| [1, 2] |\n+--------+\n\n# The following query returns the elements from the first (exclusive) to the third backward in the list [1, 2, 3, 4, 5].\nnebula> YIELD list[1, 2, 3, 4, 5][-3..-1] AS a;\n+--------+\n| a |\n+--------+\n| [3, 4] |\n+--------+\n\n# The following query sets the variables and returns the elements whose indexes are 1 and 2.\nnebula> $var = YIELD 1 AS f, 3 AS t; \\\n YIELD list[1, 2, 3][$var.f..$var.t] AS a;\n+--------+\n| a |\n+--------+\n| [2, 3] |\n+--------+\n\n# The following query returns empty because the index is out of bound. 
It will return normally when the index is within the bound.\nnebula> RETURN list[1, 2, 3, 4, 5] [0..10] AS a;\n+-----------------+\n| a |\n+-----------------+\n| [1, 2, 3, 4, 5] |\n+-----------------+\n\nnebula> RETURN list[1, 2, 3] [-5..5] AS a;\n+-----------+\n| a |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query returns empty because there is a [0..0].\nnebula> RETURN list[1, 2, 3, 4, 5] [0..0] AS a;\n+----+\n| a |\n+----+\n| [] |\n+----+\n\n# The following query returns empty because of M \u2265 N.\nnebula> RETURN list[1, 2, 3, 4, 5] [3..1] AS a;\n+----+\n| a |\n+----+\n| [] |\n+----+\n\n# When conduct a range query, if `M` or `N` is null, return `null`.\nnebula> WITH list[1,2,3] AS a \\\n RETURN a[0..null] as r;\n+----------+\n| r |\n+----------+\n| __NULL__ |\n+----------+\n\n# The following query calculates the elements in the list [1,2,3,4,5] respectively and returns them without the list head.\nnebula> RETURN tail([n IN range(1, 5) | 2 * n - 10]) AS a;\n+-----------------+\n| a |\n+-----------------+\n| [-6, -4, -2, 0] |\n+-----------------+\n\n# The following query takes the elements in the list [1,2,3] as true and return.\nnebula> RETURN [n IN range(1, 3) WHERE true | n] AS r;\n+-----------+\n| r |\n+-----------+\n| [1, 2, 3] |\n+-----------+\n\n# The following query returns the length of the list [1,2,3].\nnebula> RETURN size(list[1,2,3]);\n+-------------------+\n| size(list[1,2,3]) |\n+-------------------+\n| 3 |\n+-------------------+\n\n# The following query calculates the elements in the list [92,90] and runs a conditional judgment in a where clause.\nnebula> GO FROM \"player100\" OVER follow WHERE properties(edge).degree NOT IN [x IN [92, 90] | x + $$.player.age] \\\n YIELD dst(edge) AS id, properties(edge).degree AS degree;\n+-------------+--------+\n| id | degree |\n+-------------+--------+\n| \"player101\" | 95 |\n| \"player102\" | 90 |\n+-------------+--------+\n\n# The following query takes the query result of the MATCH statement as the elements in a list. Then it calculates and returns them.\nnebula> MATCH p = (n:player{name:\"Tim Duncan\"})-[:follow]->(m) \\\n RETURN [n IN nodes(p) | n.player.age + 100] AS r;\n+------------+\n| r |\n+------------+\n| [142, 136] |\n| [142, 141] |\n+------------+\n
"},{"location":"3.ngql-guide/3.data-types/6.list/#opencypher_compatibility_1","title":"OpenCypher compatibility","text":"null
when querying a single out-of-bound element. However, in nGQL, return OUT_OF_RANGE
when querying a single out-of-bound element.nebula> RETURN range(0,5)[-12];\n+-------------------+\n| range(0,5)[-(12)] |\n+-------------------+\n| OUT_OF_RANGE |\n+-------------------+\n
A composite data type (i.e., set, map, and list) CAN NOT be stored as properties for vertices or edges.
It is recommended to modify the graph modeling method. The composite data type should be modeled as an adjacent edge of a vertex, rather than its property. Each adjacent edge can be dynamically added or deleted. The rank values of the adjacent edges can be used for sequencing.
[(src)-[]->(m) | m.name]
.The set is a composite data type. A set is a set of values. Unlike a List, values in a set are unordered and each value must be unique.
A set starts with a left curly bracket {
and ends with a right curly bracket }
. A set contains zero, one, or more expressions. Set elements are separated from each other with commas (,
). Whitespace around elements is ignored in the set, thus line breaks, tab stops, and blanks can be used for formatting.
# The following query returns the set {1,2,3}.\nnebula> RETURN set{1, 2, 3} AS a;\n+-----------+\n| a |\n+-----------+\n| {3, 2, 1} |\n+-----------+\n\n# The following query returns the set {1,2}, Because the set does not allow repeating elements, and the order is unordered.\nnebula> RETURN set{1, 2, 1} AS a;\n+--------+\n| a |\n+--------+\n| {2, 1} |\n+--------+\n\n# The following query checks whether the set has the specified element 1.\nnebula> RETURN 1 IN set{1, 2} AS a;\n+------+\n| a |\n+------+\n| true |\n+------+\n\n# The following query counts the number of elements in the set.\nnebula> YIELD size(set{1, 2, 1}) AS a;\n+---+\n| a |\n+---+\n| 2 |\n+---+\n\n# The following query returns a set of target vertex property values.\nnebula> GO FROM \"player100\" OVER follow \\\n YIELD set{properties($$).name,properties($$).age} as a;\n+-----------------------+\n| a |\n+-----------------------+\n| {36, \"Tony Parker\"} |\n| {41, \"Manu Ginobili\"} |\n+-----------------------+\n
"},{"location":"3.ngql-guide/3.data-types/8.map/","title":"Maps","text":"The map is a composite data type. Maps are unordered collections of key-value pairs. In maps, the key is a string. The value can have any data type. You can get the map element by using map['key']
.
A map starts with a left curly bracket {
and ends with a right curly bracket }
. A map contains zero, one, or more key-value pairs. Map elements are separated from each other with commas (,
). Whitespace around elements is ignored in the map, thus line breaks, tab stops, and blanks can be used for formatting.
# The following query returns the simple map.\nnebula> YIELD map{key1: 'Value1', Key2: 'Value2'} as a;\n+----------------------------------+\n| a |\n+----------------------------------+\n| {Key2: \"Value2\", key1: \"Value1\"} |\n+----------------------------------+\n\n# The following query returns the list type map.\nnebula> YIELD map{listKey: [{inner: 'Map1'}, {inner: 'Map2'}]} as a;\n+-----------------------------------------------+\n| a |\n+-----------------------------------------------+\n| {listKey: [{inner: \"Map1\"}, {inner: \"Map2\"}]} |\n+-----------------------------------------------+\n\n# The following query returns the hybrid type map.\nnebula> RETURN map{a: LIST[1,2], b: SET{1,2,1}, c: \"hee\"} as a;\n+----------------------------------+\n| a |\n+----------------------------------+\n| {a: [1, 2], b: {2, 1}, c: \"hee\"} |\n+----------------------------------+\n\n# The following query returns the specified element in a map.\nnebula> RETURN map{a: LIST[1,2], b: SET{1,2,1}, c: \"hee\"}[\"b\"] AS b;\n+--------+\n| b |\n+--------+\n| {2, 1} |\n+--------+\n\n# The following query checks whether the map has the specified key, not support checks whether the map has the specified value yet.\nnebula> RETURN \"a\" IN MAP{a:1, b:2} AS a;\n+------+\n| a |\n+------+\n| true |\n+------+\n
"},{"location":"3.ngql-guide/3.data-types/9.type-conversion/","title":"Type Conversion/Type coercions","text":"Converting an expression of a given type to another type is known as type conversion.
NebulaGraph supports converting expressions explicit to other types. For details, see Type conversion functions.
"},{"location":"3.ngql-guide/3.data-types/9.type-conversion/#examples","title":"Examples","text":"nebula> UNWIND [true, false, 'true', 'false', NULL] AS b \\\n RETURN toBoolean(b) AS b;\n+----------+\n| b |\n+----------+\n| true |\n| false |\n| true |\n| false |\n| __NULL__ |\n+----------+\n\nnebula> RETURN toFloat(1), toFloat('1.3'), toFloat('1e3'), toFloat('not a number');\n+------------+----------------+----------------+-------------------------+\n| toFloat(1) | toFloat(\"1.3\") | toFloat(\"1e3\") | toFloat(\"not a number\") |\n+------------+----------------+----------------+-------------------------+\n| 1.0 | 1.3 | 1000.0 | __NULL__ |\n+------------+----------------+----------------+-------------------------+\n
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/1.composite-queries/","title":"Composite queries (clause structure)","text":"Composite queries put data from different queries together. They then use filters, group-bys, or sorting before returning the combined return results.
Nebula\u00a0Graph supports three methods to run composite queries (or sub-queries):
|
). The result of the previous query can be used as the input of the next query.In a composite query, do not put together openCypher and native nGQL clauses in one statement. For example, this statement is undefined: MATCH ... | GO ... | YIELD ...
.
MATCH
, RETURN
, WITH
, etc), do not introduce any pipe or semicolons to combine the sub-clauses.FETCH
, GO
, LOOKUP
, etc), you must use pipe or semicolons to combine the sub-clauses.transactional
queries (as in SQL/Cypher)","text":"For example, a query is composed of three sub-queries: A B C
, A | B | C
or A; B; C
. In that A is a read operation, B is a computation operation, and C is a write operation. If any part fails in the execution, the whole result will be undefined. There is no rollback. What is written depends on the query executor.
Note
OpenCypher has no requirement of transaction
.
# Connect multiple queries with clauses.\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})--() \\\n WITH nodes(p) AS n \\\n UNWIND n AS n1 \\\n RETURN DISTINCT n1;\n
# Only return edges.\nnebula> SHOW TAGS; SHOW EDGES;\n\n# Insert multiple vertices.\nnebula> INSERT VERTEX player(name, age) VALUES \"player100\":(\"Tim Duncan\", 42); \\\n INSERT VERTEX player(name, age) VALUES \"player101\":(\"Tony Parker\", 36); \\\n INSERT VERTEX player(name, age) VALUES \"player102\":(\"LaMarcus Aldridge\", 33);\n
# Connect multiple queries with pipes.\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge) AS id | \\\n GO FROM $-.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------+-----------------+\n| Team | Player |\n+-----------+-----------------+\n| \"Spurs\" | \"Tony Parker\" |\n| \"Hornets\" | \"Tony Parker\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------+-----------------+\n
User-defined variables allow passing the result of one statement to another.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/2.user-defined-variables/#opencypher_compatibility","title":"OpenCypher compatibility","text":"In openCypher, when you refer to the vertex, edge, or path of a variable, you need to name it first. For example:
nebula> MATCH (v:player{name:\"Tim Duncan\"}) RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{name: \"Tim Duncan\", age: 42}) |\n+----------------------------------------------------+\n
The user-defined variable in the preceding query is v
.
Caution
In a pattern of a MATCH statement, you cannot use the same edge variable repeatedly. For example, e
cannot be written in the pattern p=(v1)-[e*2..2]->(v2)-[e*2..2]->(v3)
.
User-defined variables are written as $var_name
. The var_name
consists of letters, numbers, or underline characters. Any other characters are not permitted.
The user-defined variables are valid only at the current execution (namely, in this composite query). When the execution ends, the user-defined variables will be automatically expired. The user-defined variables in one statement CANNOT be used in any other clients, executions, or sessions.
You can use user-defined variables in composite queries. Details about composite queries, see Composite queries.
Note
nebula> $var = GO FROM \"player100\" OVER follow YIELD dst(edge) AS id; \\\n GO FROM $var.id OVER serve YIELD properties($$).name AS Team, \\\n properties($^).name AS Player;\n+-----------+-----------------+\n| Team | Player |\n+-----------+-----------------+\n| \"Spurs\" | \"Tony Parker\" |\n| \"Hornets\" | \"Tony Parker\" |\n| \"Spurs\" | \"Manu Ginobili\" |\n+-----------+-----------------+\n
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/2.user-defined-variables/#set_operations_and_scope_of_user-defined_variables","title":"Set operations and scope of user-defined variables","text":"When assigning variables within a compound statement involving set operations, it is important to enclose the scope of the variable assignment in parentheses. In the example below, the source of the $var
assignment is the results of the output of two INTERSECT
statements.
$var = ( \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id \\\n INTERSECT \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id \\\n ); \\\n GO FROM $var.id OVER follow YIELD follow.degree AS degree\n
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/","title":"Reference to properties","text":"nGQL provides property references to allow you to refer to the properties of the source vertex, the destination vertex, and the edge in the GO
statement, and to refer to the output results of the statement in composite queries. This topic describes how to use these property references in nGQL.
Note
This function applies to native nGQL only.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#property_references_for_vertexes","title":"Property references for vertexes","text":"Parameter Description$^
Used to get the property of the source vertex. $$
Used to get the property of the destination vertex."},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#property_reference_syntax","title":"Property reference syntax","text":"$^.<tag_name>.<prop_name> # Source vertex property reference\n$$.<tag_name>.<prop_name> # Destination vertex property reference\n
tag_name
: The tag name of the vertex.prop_name
: The property name within the tag._src
The source vertex ID of the edge _dst
The destination vertex ID of the edge _type
The internal encoding of edge types, which uses the sign to indicate direction: positive numbers represent forward edges, and negative numbers represent backward edges. _rank
The rank value for the edge"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#property_reference_syntax_1","title":"Property reference syntax","text":"nGQL allows you to reference edge properties, including user-defined edge properties and four built-in edge properties.
<edge_type>.<prop_name> # User-defined edge property reference\n<edge_type>._src|_dst|_type|_rank # Built-in edge property reference\n
edge_type
: The edge type.prop_name
: The property name within the edge type.$-
Used to get the output results of the statement before the pipe in the composite query. For more information, see Pipe."},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#examples","title":"Examples","text":""},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#use_property_references_for_vertexes","title":"Use property references for vertexes","text":"The following query returns the name
property of the player
tag on the source vertex and the age
property of the player
tag on the destination vertex.
nebula> GO FROM \"player100\" OVER follow YIELD $^.player.name AS startName, $$.player.age AS endAge;\n+--------------+--------+\n| startName | endAge |\n+--------------+--------+\n| \"Tim Duncan\" | 36 |\n| \"Tim Duncan\" | 41 |\n+--------------+--------+\n
Legacy version compatibility
Starting from NebulaGraph 2.6.0, Schema-related functions are supported. The preceding example can be rewritten as follows in NebulaGraph master to produce the same results:
GO FROM \"player100\" OVER follow YIELD properties($^).name AS startName, properties($$).age AS endAge;\n
NebulaGraph master is compatible with both new and old syntax.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#use_property_references_for_edges","title":"Use property references for edges","text":"The following query returns the degree
property of the edge type follow
.
nebula> GO FROM \"player100\" OVER follow YIELD follow.degree;\n+---------------+\n| follow.degree |\n+---------------+\n| 95 |\n+---------------+\n
The following query returns the source vertex, the destination vertex, the edge type, and the edge rank value of the edge type follow
.
nebula> GO FROM \"player100\" OVER follow YIELD follow._src, follow._dst, follow._type, follow._rank;\n+-------------+-------------+--------------+--------------+\n| follow._src | follow._dst | follow._type | follow._rank |\n+-------------+-------------+--------------+--------------+\n| \"player100\" | \"player101\" | 17 | 0 |\n| \"player100\" | \"player125\" | 17 | 0 |\n+-------------+-------------+--------------+--------------+\n
Legacy version compatibility
Starting from NebulaGraph 2.6.0, Schema-related functions are supported. The preceding example can be rewritten as follows in NebulaGraph master to produce the same results:
GO FROM \"player100\" OVER follow YIELD properties(edge).degree;\nGO FROM \"player100\" OVER follow YIELD src(edge), dst(edge), type(edge), rank(edge);\n
NebulaGraph master is compatible with both new and old syntax.
"},{"location":"3.ngql-guide/4.variable-and-composite-queries/3.property-reference/#use_property_references_for_composite_queries","title":"Use property references for composite queries","text":"The following composite query performs the following actions:
1. Uses `$-.id` to get the results of the statement `GO FROM \"player100\" OVER follow YIELD dst(edge) AS id`, which returns the destination vertex ID of the `follow` edge type.
2. Uses the `properties($^)` function to get the name property of the player tag on the source vertex of the `serve` edge type.
3. Uses the `properties($$)` function to get the name property of the team tag on the destination vertex of the serve
edge type.nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id | \\\n GO FROM $-.id OVER serve \\\n YIELD properties($^).name AS Player, properties($$).name AS Team;\n+-----------------+-----------+\n| Player | Team |\n+-----------------+-----------+\n| \"Tony Parker\" | \"Spurs\" |\n| \"Tony Parker\" | \"Hornets\" |\n| \"Manu Ginobili\" | \"Spurs\" |\n+-----------------+-----------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/","title":"Comparison operators","text":"NebulaGraph supports the following comparison operators.
| Name | Description |
| - | - |
| `==` | Equal operator |
| `!=`, `<>` | Not equal operator |
| `>` | Greater than operator |
| `>=` | Greater than or equal operator |
| `<` | Less than operator |
| `<=` | Less than or equal operator |
| `IS NULL` | NULL check |
| `IS NOT NULL` | Not NULL check |
| `IS EMPTY` | EMPTY check |
| `IS NOT EMPTY` | Not EMPTY check |

The result of a comparison operation is `true` or `false`.
Note
- The result of a comparison could also be `NULL` or others.
- `EMPTY` is currently used only for checking, and does not support functions or operations such as `GROUP BY`, `count()`, `sum()`, `max()`, `hash()`, `collect()`, `+` or `*`.
- openCypher does not have `EMPTY`. Thus `EMPTY` is not supported in MATCH statements.
==
","text":"String comparisons are case-sensitive. Values of different types are not equal.
Note
The equal operator is ==
in nGQL, while in openCypher it is =
.
nebula> RETURN 'A' == 'a', toUpper('A') == toUpper('a'), toLower('A') == toLower('a');\n+------------+------------------------------+------------------------------+\n| (\"A\"==\"a\") | (toUpper(\"A\")==toUpper(\"a\")) | (toLower(\"A\")==toLower(\"a\")) |\n+------------+------------------------------+------------------------------+\n| false | true | true |\n+------------+------------------------------+------------------------------+\n\nnebula> RETURN '2' == 2, toInteger('2') == 2;\n+----------+---------------------+\n| (\"2\"==2) | (toInteger(\"2\")==2) |\n+----------+---------------------+\n| false | true |\n+----------+---------------------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_2","title":">
","text":"nebula> RETURN 3 > 2;\n+-------+\n| (3>2) |\n+-------+\n| true |\n+-------+\n\nnebula> WITH 4 AS one, 3 AS two \\\n RETURN one > two AS result;\n+--------+\n| result |\n+--------+\n| true |\n+--------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_3","title":">=
","text":"nebula> RETURN 2 >= \"2\", 2 >= 2;\n+----------+--------+\n| (2>=\"2\") | (2>=2) |\n+----------+--------+\n| __NULL__ | true |\n+----------+--------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_4","title":"<
","text":"nebula> YIELD 2.0 < 1.9;\n+---------+\n| (2<1.9) |\n+---------+\n| false |\n+---------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_5","title":"<=
","text":"nebula> YIELD 0.11 <= 0.11;\n+--------------+\n| (0.11<=0.11) |\n+--------------+\n| true |\n+--------------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#_6","title":"!=
","text":"nebula> YIELD 1 != '1';\n+----------+\n| (1!=\"1\") |\n+----------+\n| true |\n+----------+\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#is_not_null","title":"IS [NOT] NULL
","text":"nebula> RETURN null IS NULL AS value1, null == null AS value2, null != null AS value3;\n+--------+----------+----------+\n| value1 | value2 | value3 |\n+--------+----------+----------+\n| true | __NULL__ | __NULL__ |\n+--------+----------+----------+\n\nnebula> RETURN length(NULL), size(NULL), count(NULL), NULL IS NULL, NULL IS NOT NULL, sin(NULL), NULL + NULL, [1, NULL] IS NULL;\n+--------------+------------+-------------+--------------+------------------+-----------+-------------+------------------+\n| length(NULL) | size(NULL) | count(NULL) | NULL IS NULL | NULL IS NOT NULL | sin(NULL) | (NULL+NULL) | [1,NULL] IS NULL |\n+--------------+------------+-------------+--------------+------------------+-----------+-------------+------------------+\n| __NULL__ | __NULL__ | 0 | true | false | __NULL__ | __NULL__ | false |\n+--------------+------------+-------------+--------------+------------------+-----------+-------------+------------------+\n\nnebula> WITH {name: null} AS `map` \\\n RETURN `map`.name IS NOT NULL;\n+----------------------+\n| map.name IS NOT NULL |\n+----------------------+\n| false |\n+----------------------+\n\nnebula> WITH {name: 'Mats', name2: 'Pontus'} AS map1, \\\n {name: null} AS map2, {notName: 0, notName2: null } AS map3 \\\n RETURN map1.name IS NULL, map2.name IS NOT NULL, map3.name IS NULL;\n+-------------------+-----------------------+-------------------+\n| map1.name IS NULL | map2.name IS NOT NULL | map3.name IS NULL |\n+-------------------+-----------------------+-------------------+\n| false | false | true |\n+-------------------+-----------------------+-------------------+\n\nnebula> MATCH (n:player) \\\n RETURN n.player.age IS NULL, n.player.name IS NOT NULL, n.player.empty IS NULL;\n+----------------------+---------------------------+------------------------+\n| n.player.age IS NULL | n.player.name IS NOT NULL | n.player.empty IS NULL |\n+----------------------+---------------------------+------------------------+\n| false | true | true |\n| false | true | true |\n...\n
"},{"location":"3.ngql-guide/5.operators/1.comparison/#is_not_empty","title":"IS [NOT] EMPTY
","text":"nebula> RETURN null IS EMPTY;\n+---------------+\n| NULL IS EMPTY |\n+---------------+\n| false |\n+---------------+\n\nnebula> RETURN \"a\" IS NOT EMPTY;\n+------------------+\n| \"a\" IS NOT EMPTY |\n+------------------+\n| true |\n+------------------+\n\nnebula> GO FROM \"player100\" OVER * WHERE properties($$).name IS NOT EMPTY YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"team204\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
"},{"location":"3.ngql-guide/5.operators/10.arithmetic/","title":"Arithmetic operators","text":"NebulaGraph supports the following arithmetic operators.
| Name | Description |
| - | - |
| `+` | Addition operator |
| `-` | Minus operator |
| `*` | Multiplication operator |
| `/` | Division operator |
| `%` | Modulo operator |
| `-` | Changes the sign of the argument |
"},{"location":"3.ngql-guide/5.operators/10.arithmetic/#examples","title":"Examples","text":"nebula> RETURN 1+2 AS result;\n+--------+\n| result |\n+--------+\n| 3 |\n+--------+\n\nnebula> RETURN -10+5 AS result;\n+--------+\n| result |\n+--------+\n| -5 |\n+--------+\n\nnebula> RETURN (3*8)%5 AS result;\n+--------+\n| result |\n+--------+\n| 4 |\n+--------+\n
"},{"location":"3.ngql-guide/5.operators/2.boolean/","title":"Boolean operators","text":"NebulaGraph supports the following boolean operators.
| Name | Description |
| - | - |
| AND | Logical AND |
| NOT | Logical NOT |
| OR | Logical OR |
| XOR | Logical XOR |

For the precedence of the operators, refer to Operator Precedence.
For the logical operations with NULL
, refer to NULL.
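For example, a minimal sketch (by standard boolean logic, the three expressions evaluate to false, true, and false):
nebula> RETURN true AND false, true OR false, true XOR true;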
Multiple queries can be combined using pipe operators in nGQL.
"},{"location":"3.ngql-guide/5.operators/4.pipe/#opencypher_compatibility","title":"OpenCypher compatibility","text":"Pipe operators apply to native nGQL only.
"},{"location":"3.ngql-guide/5.operators/4.pipe/#syntax","title":"Syntax","text":"One major difference between nGQL and SQL is how sub-queries are composed.
PIPE (|)
is introduced into the sub-queries.nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS dstid, properties($$).name AS Name | \\\n GO FROM $-.dstid OVER follow YIELD dst(edge);\n\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n| \"player100\" |\n+-------------+\n
Users must define aliases in the YIELD
clause for the reference operator $-
to use, just like $-.dstid
in the preceding example.
In NebulaGraph, pipes affect performance. Take A | B
as an example, the effects are as follows:
Pipe operators operate synchronously. That is, the data can enter the clause after the pipe only as a whole, and only when the execution of clause A
before the pipe operator is completed.
If A
sends a large amount of data to |
, the entire query request may be very slow. In that case, you can try to split the statement:
1. Send A from the application.
2. Split the returned results on the application.
3. Send the results to multiple graphd processes concurrently.
4. Have every graphd process execute part of B.
This is usually much faster than executing a complete A | B
with a single graphd process.
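For instance, a sketch of this pattern on the basketballplayer dataset (the vertex IDs are illustrative; the splitting and the concurrent dispatch happen in the application, not in nGQL):

# Step 1: the application sends A and collects the returned IDs.
nebula> GO FROM \"player100\" OVER follow YIELD dst(edge) AS id;

# Steps 2-4: the application splits the IDs and sends one slice of B, such as the following, to each graphd process concurrently.
nebula> GO FROM \"player101\" OVER serve YIELD properties($$).name;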
This topic will describe the set operators, including UNION
, UNION ALL
, INTERSECT
, and MINUS
. To combine multiple queries, use these set operators.
All set operators have equal precedence. If a nGQL statement contains multiple set operators, NebulaGraph will evaluate them from left to right unless parentheses explicitly specify another order.
Caution
The names and order of the variables defined in the query statements before and after the set operator must be consistent. For example, the names and order of a,b,c
in RETURN a,b,c UNION RETURN a,b,c
need to be consistent.
<left> UNION [DISTINCT | ALL] <right> [ UNION [DISTINCT | ALL] <right> ...]\n
UNION DISTINCT
(or by short UNION
) returns the union of two sets A and B without duplicated elements.UNION ALL
returns the union of two sets A and B with duplicated elements.<left>
and <right>
must have the same number of columns and data types. Different data types are converted according to the Type Conversion.# The following statement returns the union of two query results without duplicated elements.\nnebula> GO FROM \"player102\" OVER follow YIELD dst(edge) \\\n UNION \\\n GO FROM \"player100\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n\nnebula> MATCH (v:player) \\\n WITH v.player.name AS v \\\n RETURN n ORDER BY n LIMIT 3 \\\n UNION \\\n UNWIND [\"Tony Parker\", \"Ben Simmons\"] AS n \\\n RETURN n;\n+---------------------+\n| n |\n+---------------------+\n| \"Amar'e Stoudemire\" |\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Tony Parker\" |\n+---------------------+\n\n# The following statement returns the union of two query results with duplicated elements.\nnebula> GO FROM \"player102\" OVER follow YIELD dst(edge) \\\n UNION ALL \\\n GO FROM \"player100\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n\nnebula> MATCH (v:player) \\\n WITH v.player.name AS n \\\n RETURN n ORDER BY n LIMIT 3 \\\n UNION ALL \\\n UNWIND [\"Tony Parker\", \"Ben Simmons\"] AS n \\\n RETURN n;\n+---------------------+\n| n |\n+---------------------+\n| \"Amar'e Stoudemire\" |\n| \"Aron Baynes\" |\n| \"Ben Simmons\" |\n| \"Tony Parker\" |\n| \"Ben Simmons\" |\n+---------------------+\n\n# UNION can also work with the YIELD statement. The DISTINCT keyword will check duplication by all the columns for every line, and remove duplicated lines if every column is the same.\nnebula> GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age \\\n UNION /* DISTINCT */ \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age;\n+-------------+--------+-----+\n| id | Degree | Age |\n+-------------+--------+-----+\n| \"player100\" | 75 | 42 |\n| \"player101\" | 75 | 36 |\n| \"player101\" | 95 | 36 |\n| \"player125\" | 95 | 41 |\n+-------------+--------+-----+\n
"},{"location":"3.ngql-guide/5.operators/6.set/#intersect","title":"INTERSECT","text":"<left> INTERSECT <right>\n
INTERSECT
returns the intersection of two sets A and B (denoted by A \u22c2 B).UNION
, the left
and right
must have the same number of columns and data types. Different data types are converted according to the Type Conversion.# The following statement returns the intersection of two query results.\nnebula> GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age \\\n INTERSECT \\\n GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS id, properties(edge).degree AS Degree, properties($$).age AS Age;\n+----+--------+-----+\n| id | Degree | Age |\n+----+--------+-----+\n+----+--------+-----+\n\nnebula> MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) == \"player102\" \\\n RETURN id(v2) As id, e.degree As Degree, v2.player.age AS Age \\\n INTERSECT \\\n MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) == \"player100\" \\\n RETURN id(v2) As id, e.degree As Degree, v2.player.age AS Age;\n+----+--------+-----+\n| id | Degree | Age |\n+----+--------+-----+\n+----+--------+-----+\n\nnebula> UNWIND [1,2] AS a RETURN a \\\n INTERSECT \\\n UNWIND [1,2,3,4] AS a \\\n RETURN a;\n+---+\n| a |\n+---+\n| 1 |\n| 2 |\n+---+\n
"},{"location":"3.ngql-guide/5.operators/6.set/#minus","title":"MINUS","text":"<left> MINUS <right>\n
Operator MINUS
returns the subtraction (or difference) of two sets A and B (denoted by A-B
). Always pay attention to the order of left
and right
. The set A-B
consists of elements that are in A but not in B.
# The following statement returns the elements in the first query result but not in the second query result.\nnebula> GO FROM \"player100\" OVER follow YIELD dst(edge) \\\n MINUS \\\n GO FROM \"player102\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player125\" |\n+-------------+\n\nnebula> GO FROM \"player102\" OVER follow YIELD dst(edge) AS id\\\n MINUS \\\n GO FROM \"player100\" OVER follow YIELD dst(edge) AS id;\n+-------------+\n| id |\n+-------------+\n| \"player100\" |\n+-------------+\n\nnebula> MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) ==\"player102\" \\\n RETURN id(v2) AS id\\\n MINUS \\\n MATCH (v:player)-[e:follow]->(v2) \\\n WHERE id(v) ==\"player100\" \\\n RETURN id(v2) AS id;\n+-------------+\n| id |\n+-------------+\n| \"player100\" |\n+-------------+\n\nnebula> UNWIND [1,2,3] AS a RETURN a \\\n MINUS \\\n WITH 4 AS a \\\n RETURN a;\n+---+\n| a |\n+---+\n| 1 |\n| 2 |\n| 3 |\n+---+\n
"},{"location":"3.ngql-guide/5.operators/6.set/#precedence_of_the_set_operators_and_pipe_operators","title":"Precedence of the set operators and pipe operators","text":"Please note that when a query contains a pipe |
and a set operator, the pipe takes precedence. Refer to Pipe for details. The query GO FROM 1 UNION GO FROM 2 | GO FROM 3
is the same as the query GO FROM 1 UNION (GO FROM 2 | GO FROM 3)
.
nebula> GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS play_dst \\\n UNION \\\n GO FROM \"team200\" OVER serve REVERSELY \\\n YIELD src(edge) AS play_src \\\n | GO FROM $-.play_src OVER follow YIELD dst(edge) AS play_dst;\n\n+-------------+\n| play_dst |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player117\" |\n| \"player105\" |\n+-------------+\n
The above query executes the statements in the red bar first and then executes the statement in the green box.
The parentheses can change the execution priority. For example:
nebula> (GO FROM \"player102\" OVER follow \\\n YIELD dst(edge) AS play_dst \\\n UNION \\\n GO FROM \"team200\" OVER serve REVERSELY \\\n YIELD src(edge) AS play_dst) \\\n | GO FROM $-.play_dst OVER follow YIELD dst(edge) AS play_dst;\n
In the above query, the statements within the parentheses take precedence. That is, the UNION
operation will be executed first, and its output will be executed as the input of the next operation with pipes.
You can use the following string operators for concatenating, querying, and matching.
Name Description + Concatenates strings. CONTAINS Performs searchings in strings. (NOT) IN Checks whether a value is within a set of values. (NOT) STARTS WITH Performs matchings at the beginning of a string. (NOT) ENDS WITH Performs matchings at the end of a string. Regular expressions Perform string matchings using regular expressions.Note
All the string searchings or matchings are case-sensitive.
"},{"location":"3.ngql-guide/5.operators/7.string/#examples","title":"Examples","text":""},{"location":"3.ngql-guide/5.operators/7.string/#_1","title":"+
","text":"nebula> RETURN 'a' + 'b';\n+-----------+\n| (\"a\"+\"b\") |\n+-----------+\n| \"ab\" |\n+-----------+\nnebula> UNWIND 'a' AS a UNWIND 'b' AS b RETURN a + b;\n+-------+\n| (a+b) |\n+-------+\n| \"ab\" |\n+-------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#contains","title":"CONTAINS
","text":"The CONTAINS
operator requires string types on both left and right sides.
nebula> MATCH (s:player)-[e:serve]->(t:team) WHERE id(s) == \"player101\" \\\n AND t.team.name CONTAINS \"ets\" RETURN s.player.name, e.start_year, e.end_year, t.team.name;\n+---------------+--------------+------------+-------------+\n| s.player.name | e.start_year | e.end_year | t.team.name |\n+---------------+--------------+------------+-------------+\n| \"Tony Parker\" | 2018 | 2019 | \"Hornets\" |\n+---------------+--------------+------------+-------------+\n\nnebula> GO FROM \"player101\" OVER serve WHERE (STRING)properties(edge).start_year CONTAINS \"19\" AND \\\n properties($^).name CONTAINS \"ny\" \\\n YIELD properties($^).name, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+---------------------+-----------------------------+---------------------------+---------------------+\n| properties($^).name | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+---------------------+-----------------------------+---------------------------+---------------------+\n| \"Tony Parker\" | 1999 | 2018 | \"Spurs\" |\n+---------------------+-----------------------------+---------------------------+---------------------+\n\nnebula> GO FROM \"player101\" OVER serve WHERE !(properties($$).name CONTAINS \"ets\") \\\n YIELD properties($^).name, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+---------------------+-----------------------------+---------------------------+---------------------+\n| properties($^).name | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+---------------------+-----------------------------+---------------------------+---------------------+\n| \"Tony Parker\" | 1999 | 2018 | \"Spurs\" |\n+---------------------+-----------------------------+---------------------------+---------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#not_in","title":"(NOT) IN
","text":"nebula> RETURN 1 IN [1,2,3], \"Yao\" NOT IN [\"Yi\", \"Tim\", \"Kobe\"], NULL IN [\"Yi\", \"Tim\", \"Kobe\"];\n+----------------+------------------------------------+-------------------------------+\n| (1 IN [1,2,3]) | (\"Yao\" NOT IN [\"Yi\",\"Tim\",\"Kobe\"]) | (NULL IN [\"Yi\",\"Tim\",\"Kobe\"]) |\n+----------------+------------------------------------+-------------------------------+\n| true | true | __NULL__ |\n+----------------+------------------------------------+-------------------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#not_starts_with","title":"(NOT) STARTS WITH
","text":"nebula> RETURN 'apple' STARTS WITH 'app', 'apple' STARTS WITH 'a', 'apple' STARTS WITH toUpper('a');\n+-----------------------------+---------------------------+------------------------------------+\n| (\"apple\" STARTS WITH \"app\") | (\"apple\" STARTS WITH \"a\") | (\"apple\" STARTS WITH toUpper(\"a\")) |\n+-----------------------------+---------------------------+------------------------------------+\n| true | true | false |\n+-----------------------------+---------------------------+------------------------------------+\n\nnebula> RETURN 'apple' STARTS WITH 'b','apple' NOT STARTS WITH 'app';\n+---------------------------+---------------------------------+\n| (\"apple\" STARTS WITH \"b\") | (\"apple\" NOT STARTS WITH \"app\") |\n+---------------------------+---------------------------------+\n| false | false |\n+---------------------------+---------------------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#not_ends_with","title":"(NOT) ENDS WITH
","text":"nebula> RETURN 'apple' ENDS WITH 'app', 'apple' ENDS WITH 'e', 'apple' ENDS WITH 'E', 'apple' ENDS WITH 'b';\n+---------------------------+-------------------------+-------------------------+-------------------------+\n| (\"apple\" ENDS WITH \"app\") | (\"apple\" ENDS WITH \"e\") | (\"apple\" ENDS WITH \"E\") | (\"apple\" ENDS WITH \"b\") |\n+---------------------------+-------------------------+-------------------------+-------------------------+\n| false | true | false | false |\n+---------------------------+-------------------------+-------------------------+-------------------------+\n
"},{"location":"3.ngql-guide/5.operators/7.string/#regular_expressions","title":"Regular expressions","text":"Note
Regular expressions cannot work with native nGQL statements (GO
, FETCH
, LOOKUP
, etc.). Use it in openCypher only (MATCH
, WHERE
, etc.).
NebulaGraph supports filtering by using regular expressions. The regular expression syntax is inherited from std::regex
. You can match on regular expressions by using =~ 'regexp'
. For example:
nebula> RETURN \"384748.39\" =~ \"\\\\d+(\\\\.\\\\d{2})?\";\n+--------------------------------+\n| (\"384748.39\"=~\"\\d+(\\.\\d{2})?\") |\n+--------------------------------+\n| true |\n+--------------------------------+\n\nnebula> MATCH (v:player) WHERE v.player.name =~ 'Tony.*' RETURN v.player.name;\n+---------------+\n| v.player.name |\n+---------------+\n| \"Tony Parker\" |\n+---------------+\n
"},{"location":"3.ngql-guide/5.operators/8.list/","title":"List operators","text":"NebulaGraph supports the following list operators:
List operator Description + Concatenates lists. IN Checks if an element exists in a list. [] Accesses an element(s) in a list using the index operator."},{"location":"3.ngql-guide/5.operators/8.list/#examples","title":"Examples","text":"nebula> YIELD [1,2,3,4,5]+[6,7] AS myList;\n+-----------------------+\n| myList |\n+-----------------------+\n| [1, 2, 3, 4, 5, 6, 7] |\n+-----------------------+\n\nnebula> RETURN size([NULL, 1, 2]);\n+------------------+\n| size([NULL,1,2]) |\n+------------------+\n| 3 |\n+------------------+\n\nnebula> RETURN NULL IN [NULL, 1];\n+--------------------+\n| (NULL IN [NULL,1]) |\n+--------------------+\n| __NULL__ |\n+--------------------+\n\nnebula> WITH [2, 3, 4, 5] AS numberlist \\\n UNWIND numberlist AS number \\\n WITH number \\\n WHERE number IN [2, 3, 8] \\\n RETURN number;\n+--------+\n| number |\n+--------+\n| 2 |\n| 3 |\n+--------+\n\nnebula> WITH ['Anne', 'John', 'Bill', 'Diane', 'Eve'] AS names RETURN names[1] AS result;\n+--------+\n| result |\n+--------+\n| \"John\" |\n+--------+\n
"},{"location":"3.ngql-guide/5.operators/9.precedence/","title":"Operator precedence","text":"The following list shows the precedence of nGQL operators in descending order. Operators that are shown together on a line have the same precedence.
-
(negative number)!
, NOT
*
, /
, %
-
, +
==
, >=
, >
, <=
, <
, <>
, !=
AND
OR
, XOR
=
(assignment)For operators that occur at the same precedence level within an expression, evaluation proceeds left to right, with the exception that assignments evaluate right to left.
The precedence of operators determines the order of evaluation of terms in an expression. To modify this order and group terms explicitly, use parentheses.
"},{"location":"3.ngql-guide/5.operators/9.precedence/#examples","title":"Examples","text":"nebula> RETURN 2+3*5;\n+-----------+\n| (2+(3*5)) |\n+-----------+\n| 17 |\n+-----------+\n\nnebula> RETURN (2+3)*5;\n+-----------+\n| ((2+3)*5) |\n+-----------+\n| 25 |\n+-----------+\n
"},{"location":"3.ngql-guide/5.operators/9.precedence/#opencypher_compatibility","title":"OpenCypher compatibility","text":"In openCypher, comparisons can be chained arbitrarily, e.g., x < y <= z
is equivalent to x < y AND y <= z
in openCypher.
But in nGQL, x < y <= z
is equivalent to (x < y) <= z
. The result of (x < y)
is a boolean. Compare it with an integer z
, and you will get the final result NULL
.
This topic describes the built-in math functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#abs","title":"abs()","text":"abs() returns the absolute value of the argument.
Syntax: abs(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN abs(-10);\n+------------+\n| abs(-(10)) |\n+------------+\n| 10 |\n+------------+\nnebula> RETURN abs(5-6);\n+------------+\n| abs((5-6)) |\n+------------+\n| 1 |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#floor","title":"floor()","text":"floor() returns the largest integer value smaller than or equal to the argument.(Rounds down)
Syntax: floor(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN floor(9.9);\n+------------+\n| floor(9.9) |\n+------------+\n| 9.0 |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#ceil","title":"ceil()","text":"ceil() returns the smallest integer greater than or equal to the argument.(Rounds up)
Syntax: ceil(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN ceil(9.1);\n+-----------+\n| ceil(9.1) |\n+-----------+\n| 10.0 |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#round","title":"round()","text":"round() returns the rounded value of the specified number. Pay attention to the floating-point precision when using this function.
Syntax: round(<expression>, <digit>)
expression
: An expression of which the result type is double.digit
: Decimal digits. If digit
is less than 0, round at the left of the decimal point.Example:
nebula> RETURN round(314.15926, 2);\n+--------------------+\n| round(314.15926,2) |\n+--------------------+\n| 314.16 |\n+--------------------+\nnebula> RETURN round(314.15926, -1);\n+-----------------------+\n| round(314.15926,-(1)) |\n+-----------------------+\n| 310.0 |\n+-----------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#sqrt","title":"sqrt()","text":"sqrt() returns the square root of the argument.
Syntax: sqrt(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN sqrt(9);\n+---------+\n| sqrt(9) |\n+---------+\n| 3.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#cbrt","title":"cbrt()","text":"cbrt() returns the cubic root of the argument.
Syntax: cbrt(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN cbrt(8);\n+---------+\n| cbrt(8) |\n+---------+\n| 2.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#hypot","title":"hypot()","text":"hypot() returns the hypotenuse of a right-angled triangle.
Syntax: hypot(<expression_x>,<expression_y>)
expression_x
, expression_y
: An expression of which the result type is double. They represent the side lengths x and y of a right triangle.Example:
nebula> RETURN hypot(3,2*2);\n+----------------+\n| hypot(3,(2*2)) |\n+----------------+\n| 5.0 |\n+----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#pow","title":"pow()","text":"pow() returns the result of xy.
Syntax: pow(<expression_x>,<expression_y>,)
expression_x
: An expression of which the result type is double. It represents the base x
.expression_y
: An expression of which the result type is double. It represents the exponential y
.Example:
nebula> RETURN pow(3,3);\n+----------+\n| pow(3,3) |\n+----------+\n| 27 |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#exp","title":"exp()","text":"exp() returns the result of ex.
Syntax: exp(<expression>)
expression
: An expression of which the result type is double. It represents the exponential x
.Example:
nebula> RETURN exp(2);\n+------------------+\n| exp(2) |\n+------------------+\n| 7.38905609893065 |\n+------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#exp2","title":"exp2()","text":"exp2() returns the result of 2x.
Syntax: exp2(<expression>)
expression
: An expression of which the result type is double. It represents the exponential x
.Example:
nebula> RETURN exp2(3);\n+---------+\n| exp2(3) |\n+---------+\n| 8.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#log","title":"log()","text":"log() returns the base-e logarithm of the argument. (\\(log_{e}{N}\\))
Syntax: log(<expression>)
expression
: An expression of which the result type is double. It represents the antilogarithm N
.Example:
nebula> RETURN log(8);\n+--------------------+\n| log(8) |\n+--------------------+\n| 2.0794415416798357 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#log2","title":"log2()","text":"log2() returns the base-2 logarithm of the argument. (\\(log_{2}{N}\\))
Syntax: log2(<expression>)
expression
: An expression of which the result type is double. It represents the antilogarithm N
.Example:
nebula> RETURN log2(8);\n+---------+\n| log2(8) |\n+---------+\n| 3.0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#log10","title":"log10()","text":"log10() returns the base-10 logarithm of the argument. (\\(log_{10}{N}\\))
Syntax: log10(<expression>)
expression
: An expression of which the result type is double. It represents the antilogarithm N
.Example:
nebula> RETURN log10(100);\n+------------+\n| log10(100) |\n+------------+\n| 2.0 |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#sin","title":"sin()","text":"sin() returns the sine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: sin(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN sin(3);\n+--------------------+\n| sin(3) |\n+--------------------+\n| 0.1411200080598672 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#asin","title":"asin()","text":"asin() returns the inverse sine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: asin(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN asin(0.5);\n+--------------------+\n| asin(0.5) |\n+--------------------+\n| 0.5235987755982989 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#cos","title":"cos()","text":"cos() returns the cosine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: cos(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN cos(0.5);\n+--------------------+\n| cos(0.5) |\n+--------------------+\n| 0.8775825618903728 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#acos","title":"acos()","text":"acos() returns the inverse cosine of the argument. Users can convert angles to radians using the function radians()
.
Syntax: acos(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN acos(0.5);\n+--------------------+\n| acos(0.5) |\n+--------------------+\n| 1.0471975511965979 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#tan","title":"tan()","text":"tan() returns the tangent of the argument. Users can convert angles to radians using the function radians()
.
Syntax: tan(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN tan(0.5);\n+--------------------+\n| tan(0.5) |\n+--------------------+\n| 0.5463024898437905 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#atan","title":"atan()","text":"atan() returns the inverse tangent of the argument. Users can convert angles to radians using the function radians()
.
Syntax: atan(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN atan(0.5);\n+--------------------+\n| atan(0.5) |\n+--------------------+\n| 0.4636476090008061 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#rand","title":"rand()","text":"rand() returns a random floating point number in the range from 0 (inclusive) to 1 (exclusive); i.e.[0,1).
Syntax: rand()
Example:
nebula> RETURN rand();\n+--------------------+\n| rand() |\n+--------------------+\n| 0.6545837172298736 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#rand32","title":"rand32()","text":"rand32() returns a random 32-bit integer in [min, max)
.
Syntax: rand32(<expression_min>,<expression_max>)
expression_min
: An expression of which the result type is int. It represents the minimum min
.expression_max
: An expression of which the result type is int. It represents the maximum max
.max
and min
is 0
by default. If you set no argument, the system returns a random signed 32-bit integer.Example:
nebula> RETURN rand32(1,100);\n+---------------+\n| rand32(1,100) |\n+---------------+\n| 63 |\n+---------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#rand64","title":"rand64()","text":"rand64() returns a random 64-bit integer in [min, max)
.
Syntax: rand64(<expression_min>,<expression_max>)
expression_min
: An expression of which the result type is int. It represents the minimum min
.expression_max
: An expression of which the result type is int. It represents the maximum max
.max
and min
is 0
by default. If you set no argument, the system returns a random signed 64-bit integer.Example:
nebula> RETURN rand64(1,100);\n+---------------+\n| rand64(1,100) |\n+---------------+\n| 34 |\n+---------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#bit_and","title":"bit_and()","text":"bit_and() returns the result of bitwise AND.
Syntax: bit_and(<expression_1>,<expression_2>)
expression_1
, expression_2
: An expression of which the result type is int.Example:
nebula> RETURN bit_and(5,6);\n+--------------+\n| bit_and(5,6) |\n+--------------+\n| 4 |\n+--------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#bit_or","title":"bit_or()","text":"bit_or() returns the result of bitwise OR.
Syntax: bit_or(<expression_1>,<expression_2>)
expression_1
, expression_2
: An expression of which the result type is int.Example:
nebula> RETURN bit_or(5,6);\n+-------------+\n| bit_or(5,6) |\n+-------------+\n| 7 |\n+-------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#bit_xor","title":"bit_xor()","text":"bit_xor() returns the result of bitwise XOR.
Syntax: bit_xor(<expression_1>,<expression_2>)
expression_1
, expression_2
: An expression of which the result type is int.Example:
nebula> RETURN bit_xor(5,6);\n+--------------+\n| bit_xor(5,6) |\n+--------------+\n| 3 |\n+--------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#size","title":"size()","text":"size() returns the number of elements in a list or a map, or the length of a string.
Syntax: size({<expression>|<string>})
expression
: An expression for a list or map.string
: A specified string.Example:
nebula> RETURN size([1,2,3,4]);\n+-----------------+\n| size([1,2,3,4]) |\n+-----------------+\n| 4 |\n+-----------------+\n
nebula> RETURN size(\"basketballplayer\") as size;\n+------+\n| size |\n+------+\n| 16 |\n+------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#range","title":"range()","text":"range() returns a list of integers from [start,end]
in the specified steps.
Syntax: range(<expression_start>,<expression_end>[,<expression_step>])
expression_start
: An expression of which the result type is int. It represents the starting value start
.expression_end
: An expression of which the result type is int. It represents the end value end
.expression_step
: An expression of which the result type is int. It represents the step size step
, step
is 1 by default.Example:
nebula> RETURN range(1,3*3,2);\n+------------------+\n| range(1,(3*3),2) |\n+------------------+\n| [1, 3, 5, 7, 9] |\n+------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#sign","title":"sign()","text":"sign() returns the signum of the given number. If the number is 0
, the system returns 0
. If the number is negative, the system returns -1
. If the number is positive, the system returns 1
.
Syntax: sign(<expression>)
expression
: An expression of which the result type is double.Example:
nebula> RETURN sign(10);\n+----------+\n| sign(10) |\n+----------+\n| 1 |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#e","title":"e()","text":"e() returns the base of the natural logarithm, e (2.718281828459045).
Syntax: e()
Example:
nebula> RETURN e();\n+-------------------+\n| e() |\n+-------------------+\n| 2.718281828459045 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#pi","title":"pi()","text":"pi() returns the mathematical constant pi (3.141592653589793).
Syntax: pi()
Example:
nebula> RETURN pi();\n+-------------------+\n| pi() |\n+-------------------+\n| 3.141592653589793 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/1.math/#radians","title":"radians()","text":"radians() converts angles to radians.
Syntax: radians(<angle>)
Example:
nebula> RETURN radians(180);\n+-------------------+\n| radians(180) |\n+-------------------+\n| 3.141592653589793 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/14.geo/","title":"Geography functions","text":"Geography functions are used to generate or perform operations on the value of the geography data type.
For descriptions of the geography data types, see Geography.
"},{"location":"3.ngql-guide/6.functions-and-expressions/14.geo/#descriptions","title":"Descriptions","text":"Function Return Type Description ST_Point(longitude, latitude)GEOGRAPHY
Creates the geography that contains a point. ST_GeogFromText(wkt_string) GEOGRAPHY
Returns the geography corresponding to the input WKT string. ST_ASText(geography) STRING
Returns the WKT string of the input geography. ST_Centroid(geography) GEOGRAPHY
Returns the centroid of the input geography in the form of the single point geography. ST_ISValid(geography) BOOL
Returns whether the input geography is valid. ST_Intersects(geography_1, geography_2) BOOL
Returns whether geography_1 and geography_2 have intersections. ST_Covers(geography_1, geography_2) BOOL
Returns whether geography_1 completely contains geography_2. If there is no point outside geography_1 in geography_2, return True. ST_CoveredBy(geography_1, geography_2) BOOL
Returns whether geography_2 completely contains geography_1.If there is no point outside geography_2 in geography_1, return True. ST_DWithin(geography_1, geography_2, distance) BOOL
If the distance between one point (at least) in geography_1 and one point in geography_2 is less than or equal to the distance specified by the distance parameter (measured by meters), return True. ST_Distance(geography_1, geography_2) FLOAT
Returns the smallest possible distance (measured by meters) between two non-empty geographies. S2_CellIdFromPoint(point_geography) INT
Returns the S2 Cell ID that covers the point geography. S2_CoveringCellIds(geography) ARRAY<INT64>
Returns an array of S2 Cell IDs that cover the input geography."},{"location":"3.ngql-guide/6.functions-and-expressions/14.geo/#examples","title":"Examples","text":"nebula> RETURN ST_ASText(ST_Point(1,1));\n+--------------------------+\n| ST_ASText(ST_Point(1,1)) |\n+--------------------------+\n| \"POINT(1 1)\" |\n+--------------------------+\n\nnebula> RETURN ST_ASText(ST_GeogFromText(\"POINT(3 8)\"));\n+------------------------------------------+\n| ST_ASText(ST_GeogFromText(\"POINT(3 8)\")) |\n+------------------------------------------+\n| \"POINT(3 8)\" |\n+------------------------------------------+\n\nnebula> RETURN ST_ASTEXT(ST_Centroid(ST_GeogFromText(\"LineString(0 1,1 0)\")));\n+----------------------------------------------------------------+\n| ST_ASTEXT(ST_Centroid(ST_GeogFromText(\"LineString(0 1,1 0)\"))) |\n+----------------------------------------------------------------+\n| \"POINT(0.5000380800773782 0.5000190382261059)\" |\n+----------------------------------------------------------------+\n\nnebula> RETURN ST_ISValid(ST_GeogFromText(\"POINT(3 8)\"));\n+-------------------------------------------+\n| ST_ISValid(ST_GeogFromText(\"POINT(3 8)\")) |\n+-------------------------------------------+\n| true |\n+-------------------------------------------+\n\nnebula> RETURN ST_Intersects(ST_GeogFromText(\"LineString(0 1,1 0)\"),ST_GeogFromText(\"LineString(0 0,1 1)\"));\n+----------------------------------------------------------------------------------------------+\n| ST_Intersects(ST_GeogFromText(\"LineString(0 1,1 0)\"),ST_GeogFromText(\"LineString(0 0,1 1)\")) |\n+----------------------------------------------------------------------------------------------+\n| true |\n+----------------------------------------------------------------------------------------------+\n\nnebula> RETURN ST_Covers(ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\"),ST_Point(1,2));\n+--------------------------------------------------------------------------------+\n| ST_Covers(ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\"),ST_Point(1,2)) |\n+--------------------------------------------------------------------------------+\n| true |\n+--------------------------------------------------------------------------------+\n\nnebula> RETURN ST_CoveredBy(ST_Point(1,2),ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\"));\n+-----------------------------------------------------------------------------------+\n| ST_CoveredBy(ST_Point(1,2),ST_GeogFromText(\"POLYGON((0 0,10 0,10 10,0 10,0 0))\")) |\n+-----------------------------------------------------------------------------------+\n| true |\n+-----------------------------------------------------------------------------------+\n\nnebula> RETURN ST_dwithin(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\"),20000000000.0);\n+---------------------------------------------------------------------------------------+\n| ST_dwithin(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\"),20000000000) |\n+---------------------------------------------------------------------------------------+\n| true |\n+---------------------------------------------------------------------------------------+\n\nnebula> RETURN ST_Distance(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\"));\n+----------------------------------------------------------------------------+\n| ST_Distance(ST_GeogFromText(\"Point(0 0)\"),ST_GeogFromText(\"Point(10 10)\")) 
|\n+----------------------------------------------------------------------------+\n| 1.5685230187677438e+06 |\n+----------------------------------------------------------------------------+\n\nnebula> RETURN S2_CellIdFromPoint(ST_GeogFromText(\"Point(1 1)\"));\n+---------------------------------------------------+\n| S2_CellIdFromPoint(ST_GeogFromText(\"Point(1 1)\")) |\n+---------------------------------------------------+\n| 1153277837650709461 |\n+---------------------------------------------------+\n\nnebula> RETURN S2_CoveringCellIds(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\"));\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| S2_CoveringCellIds(ST_GeogFromText(\"POLYGON((0 1, 1 2, 2 3, 0 1))\")) |\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| [1152391494368201343, 1153466862374223872, 1153554823304445952, 1153836298281156608, 1153959443583467520, 1154240918560178176, 1160503736791990272, 1160591697722212352] |\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/","title":"Aggregating functions","text":"This topic describes the aggregating functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#avg","title":"avg()","text":"avg() returns the average value of the argument.
Syntax: avg(<expression>)
Example:
nebula> MATCH (v:player) RETURN avg(v.player.age);\n+--------------------+\n| avg(v.player.age) |\n+--------------------+\n| 33.294117647058826 |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#count","title":"count()","text":"count() returns the number of records.
count()
and GROUP BY
together to group and count the number of parameters. Use YIELD
to return.count()
and RETURN
. GROUP BY
is not necessary.Syntax: count({<expression> | *})
Example:
nebula> WITH [NULL, 1, 1, 2, 2] As a UNWIND a AS b \\\n RETURN count(b), count(*), count(DISTINCT b);\n+----------+----------+-------------------+\n| count(b) | count(*) | count(distinct b) |\n+----------+----------+-------------------+\n| 4 | 5 | 2 |\n+----------+----------+-------------------+\n
# The statement in the following example searches for the people whom `player101` follows and people who follow `player101`, i.e. a bidirectional query.\n# Group and count the number of parameters.\nnebula> GO FROM \"player101\" OVER follow BIDIRECT \\\n YIELD properties($$).name AS Name \\\n | GROUP BY $-.Name YIELD $-.Name, count(*);\n+---------------------+----------+\n| $-.Name | count(*) |\n+---------------------+----------+\n| \"LaMarcus Aldridge\" | 2 |\n| \"Tim Duncan\" | 2 |\n| \"Marco Belinelli\" | 1 |\n| \"Manu Ginobili\" | 1 |\n| \"Boris Diaw\" | 1 |\n| \"Dejounte Murray\" | 1 |\n+---------------------+----------+\n\n# Count the number of parameters.\nnebula> MATCH (v1:player)-[:follow]-(v2:player) \\\n WHERE id(v1)== \"player101\" \\\n RETURN v2.player.name AS Name, count(*) as cnt ORDER BY cnt DESC;\n+---------------------+-----+\n| Name | cnt |\n+---------------------+-----+\n| \"LaMarcus Aldridge\" | 2 |\n| \"Tim Duncan\" | 2 |\n| \"Boris Diaw\" | 1 |\n| \"Manu Ginobili\" | 1 |\n| \"Dejounte Murray\" | 1 |\n| \"Marco Belinelli\" | 1 |\n+---------------------+-----+\n
The preceding example retrieves two columns:
$-.Name
: the names of the people.count(*)
: how many times the names show up.Because there are no duplicate names in the basketballplayer
dataset, the number 2
in the column count(*)
shows that the person in that row and player101
have followed each other.
# a: The statement in the following example retrieves the age distribution of the players in the dataset.\nnebula> LOOKUP ON player \\\n YIELD player.age As playerage \\\n | GROUP BY $-.playerage \\\n YIELD $-.playerage as age, count(*) AS number \\\n | ORDER BY $-.number DESC, $-.age DESC;\n+-----+--------+\n| age | number |\n+-----+--------+\n| 34 | 4 |\n| 33 | 4 |\n| 30 | 4 |\n| 29 | 4 |\n| 38 | 3 |\n+-----+--------+\n...\n# b: The statement in the following example retrieves the age distribution of the players in the dataset.\nnebula> MATCH (n:player) \\\n RETURN n.player.age as age, count(*) as number \\\n ORDER BY number DESC, age DESC;\n+-----+--------+\n| age | number |\n+-----+--------+\n| 34 | 4 |\n| 33 | 4 |\n| 30 | 4 |\n| 29 | 4 |\n| 38 | 3 |\n+-----+--------+\n...\n
# The statement in the following example counts the number of edges that Tim Duncan relates.\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) -[e]- (v2) \\\n RETURN count(e);\n+----------+\n| count(e) |\n+----------+\n| 13 |\n+----------+\n\n# The statement in the following example counts the number of edges that Tim Duncan relates and returns two columns (no DISTINCT and DISTINCT) in multi-hop queries.\nnebula> MATCH (n:player {name : \"Tim Duncan\"})-[]->(friend:player)-[]->(fof:player) \\\n RETURN count(fof), count(DISTINCT fof);\n+------------+---------------------+\n| count(fof) | count(distinct fof) |\n+------------+---------------------+\n| 4 | 3 |\n+------------+---------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#max","title":"max()","text":"max() returns the maximum value.
Syntax: max(<expression>)
Example:
nebula> MATCH (v:player) RETURN max(v.player.age);\n+-------------------+\n| max(v.player.age) |\n+-------------------+\n| 47 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#min","title":"min()","text":"min() returns the minimum value.
Syntax: min(<expression>)
Example:
nebula> MATCH (v:player) RETURN min(v.player.age);\n+-------------------+\n| min(v.player.age) |\n+-------------------+\n| 20 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#collect","title":"collect()","text":"collect() returns a list containing the values returned by an expression. Using this function aggregates data by merging multiple records or values into a single list.
Syntax: collect(<expression>)
Example:
nebula> UNWIND [1, 2, 1] AS a \\\n RETURN a;\n+---+\n| a |\n+---+\n| 1 |\n| 2 |\n| 1 |\n+---+\n\nnebula> UNWIND [1, 2, 1] AS a \\\n RETURN collect(a);\n+------------+\n| collect(a) |\n+------------+\n| [1, 2, 1] |\n+------------+\n\nnebula> UNWIND [1, 2, 1] AS a \\\n RETURN a, collect(a), size(collect(a));\n+---+------------+------------------+\n| a | collect(a) | size(collect(a)) |\n+---+------------+------------------+\n| 2 | [2] | 1 |\n| 1 | [1, 1] | 2 |\n+---+------------+------------------+\n\n# The following examples sort the results in descending order, limit output rows to 3, and collect the output into a list.\nnebula> UNWIND [\"c\", \"b\", \"a\", \"d\" ] AS p \\\n WITH p AS q \\\n ORDER BY q DESC LIMIT 3 \\\n RETURN collect(q);\n+-----------------+\n| collect(q) |\n+-----------------+\n| [\"d\", \"c\", \"b\"] |\n+-----------------+\n\nnebula> WITH [1, 1, 2, 2] AS coll \\\n UNWIND coll AS x \\\n WITH DISTINCT x \\\n RETURN collect(x) AS ss;\n+--------+\n| ss |\n+--------+\n| [1, 2] |\n+--------+\n\nnebula> MATCH (n:player) \\\n RETURN collect(n.player.age);\n+---------------------------------------------------------------+\n| collect(n.player.age) |\n+---------------------------------------------------------------+\n| [32, 32, 34, 29, 41, 40, 33, 25, 40, 37, ...\n...\n\n# The following example aggregates all the players' names by their ages.\nnebula> MATCH (n:player) \\\n RETURN n.player.age AS age, collect(n.player.name);\n+-----+--------------------------------------------------------------------------+\n| age | collect(n.player.name) |\n+-----+--------------------------------------------------------------------------+\n| 24 | [\"Giannis Antetokounmpo\"] |\n| 20 | [\"Luka Doncic\"] |\n| 25 | [\"Joel Embiid\", \"Kyle Anderson\"] |\n+-----+--------------------------------------------------------------------------+\n...\n\nnebula> GO FROM \"player100\" OVER serve \\\n YIELD properties($$).name AS name \\\n | GROUP BY $-.name \\\n YIELD collect($-.name) AS name;\n+-----------+\n| name |\n+-----------+\n| [\"Spurs\"] |\n+-----------+\n\nnebula> LOOKUP ON player \\\n YIELD player.age As playerage \\\n | GROUP BY $-.playerage \\\n YIELD collect($-.playerage) AS playerage;\n+------------------+\n| playerage |\n+------------------+\n| [22] |\n| [47] |\n| [43] |\n| [25, 25] |\n+------------------+\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#std","title":"std()","text":"std() returns the population standard deviation.
Syntax: std(<expression>)
Example:
nebula> MATCH (v:player) RETURN std(v.player.age);\n+-------------------+\n| std(v.player.age) |\n+-------------------+\n| 6.423895701687502 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#sum","title":"sum()","text":"sum() returns the sum value.
Syntax: sum(<expression>)
Example:
nebula> MATCH (v:player) RETURN sum(v.player.age);\n+-------------------+\n| sum(v.player.age) |\n+-------------------+\n| 1698 |\n+-------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/15.aggregating/#aggregating_example","title":"Aggregating example","text":"nebula> GO FROM \"player100\" OVER follow YIELD dst(edge) AS dst, properties($$).age AS age \\\n | GROUP BY $-.dst \\\n YIELD \\\n $-.dst AS dst, \\\n toInteger((sum($-.age)/count($-.age)))+avg(distinct $-.age+1)+1 AS statistics;\n+-------------+------------+\n| dst | statistics |\n+-------------+------------+\n| \"player125\" | 84.0 |\n| \"player101\" | 74.0 |\n+-------------+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/","title":"Type conversion functions","text":"This topic describes the type conversion functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#toboolean","title":"toBoolean()","text":"toBoolean() converts a string value to a boolean value.
Syntax: toBoolean(<value>)
Example:
nebula> UNWIND [true, false, 'true', 'false', NULL] AS b \\\n RETURN toBoolean(b) AS b;\n+----------+\n| b |\n+----------+\n| true |\n| false |\n| true |\n| false |\n| __NULL__ |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#tofloat","title":"toFloat()","text":"toFloat() converts an integer or string value to a floating point number.
Syntax: toFloat(<value>)
Example:
nebula> RETURN toFloat(1), toFloat('1.3'), toFloat('1e3'), toFloat('not a number');\n+------------+----------------+----------------+-------------------------+\n| toFloat(1) | toFloat(\"1.3\") | toFloat(\"1e3\") | toFloat(\"not a number\") |\n+------------+----------------+----------------+-------------------------+\n| 1.0 | 1.3 | 1000.0 | __NULL__ |\n+------------+----------------+----------------+-------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#tostring","title":"toString()","text":"toString() converts non-compound types of data, such as numbers, booleans, and so on, to strings.
Syntax: toString(<value>)
Example:
nebula> RETURN toString(9669) AS int2str, toString(null) AS null2str;\n+---------+----------+\n| int2str | null2str |\n+---------+----------+\n| \"9669\" | __NULL__ |\n+---------+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#tointeger","title":"toInteger()","text":"toInteger() converts a floating point or string value to an integer value.
Syntax: toInteger(<value>)
Example:
nebula> RETURN toInteger(1), toInteger('1'), toInteger('1e3'), toInteger('not a number');\n+--------------+----------------+------------------+---------------------------+\n| toInteger(1) | toInteger(\"1\") | toInteger(\"1e3\") | toInteger(\"not a number\") |\n+--------------+----------------+------------------+---------------------------+\n| 1 | 1 | 1000 | __NULL__ |\n+--------------+----------------+------------------+---------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#toset","title":"toSet()","text":"toSet() converts a list or set value to a set value.
Syntax: toSet(<value>)
Example:
nebula> RETURN toSet(list[1,2,3,1,2]) AS list2set;\n+-----------+\n| list2set |\n+-----------+\n| {3, 1, 2} |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/16.type-conversion/#hash","title":"hash()","text":"hash() returns the hash value of the argument. The argument can be a number, a string, a list, a boolean, null, or an expression that evaluates to a value of the preceding data types.
The source code of the hash()
function (MurmurHash2), seed (0xc70f6907UL
), and other parameters can be found in MurmurHash2.h
.
For Java, the hash function operates as follows.
MurmurHash2.hash64(\"to_be_hashed\".getBytes(),\"to_be_hashed\".getBytes().length, 0xc70f6907)\n
Syntax: hash(<string>)
Example:
nebula> RETURN hash(\"abcde\");\n+--------------------+\n| hash(\"abcde\") |\n+--------------------+\n| 811036730794841393 |\n+--------------------+\nnebula> YIELD hash([1,2,3]);\n+----------------+\n| hash([1,2,3]) |\n+----------------+\n| 11093822460243 |\n+----------------+\nnebula> YIELD hash(NULL);\n+------------+\n| hash(NULL) |\n+------------+\n| -1 |\n+------------+\nnebula> YIELD hash(toLower(\"HELLO NEBULA\"));\n+-------------------------------+\n| hash(toLower(\"HELLO NEBULA\")) |\n+-------------------------------+\n| -8481157362655072082 |\n+-------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/","title":"Built-in string functions","text":"This topic describes the built-in string functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#precautions","title":"Precautions","text":"1
, while in C language it starts from 0
.strcasecmp() compares strings a and b, ignoring case.
Syntax: strcasecmp(<string_a>,<string_b>)
string_a
, string_b
: Strings to compare.string_a = string_b
, the return value is 0
. When string_a > string_b
, the return value is greater than 0
. When string_a < string_b
, the return value is less than 0
.Example:
nebula> RETURN strcasecmp(\"a\",\"aa\");\n+----------------------+\n| strcasecmp(\"a\",\"aa\") |\n+----------------------+\n| -97 |\n+----------------------+\n
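Because the comparison ignores case, strings that differ only in letter case compare as equal; for example:
nebula> RETURN strcasecmp(\"ABC\",\"abc\");\n+--------------------------+\n| strcasecmp(\"ABC\",\"abc\") |\n+--------------------------+\n| 0 |\n+--------------------------+\n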
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#lower_and_tolower","title":"lower() and toLower()","text":"lower() and toLower() can both returns the argument in lowercase.
Syntax: lower(<string>)
, toLower(<string>)
string
: A specified string.Example:
nebula> RETURN lower(\"Basketball_Player\");\n+----------------------------+\n| lower(\"Basketball_Player\") |\n+----------------------------+\n| \"basketball_player\" |\n+----------------------------+\n
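toLower() is interchangeable; for example:
nebula> RETURN toLower(\"Basketball_Player\");\n+------------------------------+\n| toLower(\"Basketball_Player\") |\n+------------------------------+\n| \"basketball_player\" |\n+------------------------------+\n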
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#upper_and_toupper","title":"upper() and toUpper()","text":"upper() and toUpper() can both returns the argument in uppercase.
Syntax: upper(<string>)
, toUpper(<string>)
string
: A specified string.Example:
nebula> RETURN upper(\"Basketball_Player\");\n+----------------------------+\n| upper(\"Basketball_Player\") |\n+----------------------------+\n| \"BASKETBALL_PLAYER\" |\n+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#length","title":"length()","text":"length() returns the length of the given string in bytes.
Syntax: length({<string>|<path>})
string
: A specified string.path
: A specified path represented by a variable.Example:
nebula> RETURN length(\"basketball\");\n+----------------------+\n| length(\"basketball\") |\n+----------------------+\n| 10 |\n+----------------------+\n
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) return length(p);\n+-----------+\n| length(p) |\n+-----------+\n| 1 |\n| 1 |\n| 1 |\n+-----------+\n
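Because the length is counted in bytes rather than characters, multi-byte characters add more than 1 each. A hedged sketch, assuming the string is stored as UTF-8, where each of the two CJK characters occupies 3 bytes:
nebula> RETURN length(\"你好\");\n+----------------+\n| length(\"你好\") |\n+----------------+\n| 6 |\n+----------------+\n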
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#trim","title":"trim()","text":"trim() removes the spaces at the leading and trailing of the string.
Syntax: trim(<string>)
string
: A specified string.Example:
nebula> RETURN trim(\" basketball player \");\n+-----------------------------+\n| trim(\" basketball player \") |\n+-----------------------------+\n| \"basketball player\" |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#ltrim","title":"ltrim()","text":"ltrim() removes the spaces at the leading of the string.
Syntax: ltrim(<string>)
string
: A specified string.Example:
nebula> RETURN ltrim(\" basketball player \");\n+------------------------------+\n| ltrim(\" basketball player \") |\n+------------------------------+\n| \"basketball player \" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#rtrim","title":"rtrim()","text":"rtrim() removes the spaces at the trailing of the string.
Syntax: rtrim(<string>)
string
: A specified string.Example:
nebula> RETURN rtrim(\" basketball player \");\n+------------------------------+\n| rtrim(\" basketball player \") |\n+------------------------------+\n| \" basketball player\" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#left","title":"left()","text":"left() returns a substring consisting of several characters from the leading of a string.
Syntax: left(<string>,<count>)
string
: A specified string.count
: The number of characters from the leading of the string. If the string is shorter than count
, the system returns the string itself.Example:
nebula> RETURN left(\"basketball_player\",6);\n+-----------------------------+\n| left(\"basketball_player\",6) |\n+-----------------------------+\n| \"basket\" |\n+-----------------------------+\n
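If the string is shorter than count, the string itself is returned, as noted above:
nebula> RETURN left(\"ab\",6);\n+--------------+\n| left(\"ab\",6) |\n+--------------+\n| \"ab\" |\n+--------------+\n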
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#right","title":"right()","text":"right() returns a substring consisting of several characters from the trailing of a string.
Syntax: right(<string>,<count>)
string
: A specified string.count
: The number of characters from the trailing of the string. If the string is shorter than count
, the system returns the string itself.Example:
nebula> RETURN right(\"basketball_player\",6);\n+------------------------------+\n| right(\"basketball_player\",6) |\n+------------------------------+\n| \"player\" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#lpad","title":"lpad()","text":"lpad() pads a specified string from the left-side to the specified length and returns the result string.
Syntax: lpad(<string>,<count>,<letters>)
string
: A specified string.count
: The length of the string after it has been left-padded. If count is less than the length of string, only the first count characters of string are returned.letters
: The string used to pad on the left.Example:
nebula> RETURN lpad(\"abcd\",10,\"b\");\n+---------------------+\n| lpad(\"abcd\",10,\"b\") |\n+---------------------+\n| \"bbbbbbabcd\" |\n+---------------------+\nnebula> RETURN lpad(\"abcd\",3,\"b\");\n+--------------------+\n| lpad(\"abcd\",3,\"b\") |\n+--------------------+\n| \"abc\" |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#rpad","title":"rpad()","text":"rpad() pads a specified string from the right-side to the specified length and returns the result string.
Syntax: rpad(<string>,<count>,<letters>)
string
: A specified string.count
: The length of the string after it has been right-padded. If count is less than the length of string, only the first count characters of string are returned.letters
: The string used to pad on the right.Example:
nebula> RETURN rpad(\"abcd\",10,\"b\");\n+---------------------+\n| rpad(\"abcd\",10,\"b\") |\n+---------------------+\n| \"abcdbbbbbb\" |\n+---------------------+\nnebula> RETURN rpad(\"abcd\",3,\"b\");\n+--------------------+\n| rpad(\"abcd\",3,\"b\") |\n+--------------------+\n| \"abc\" |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#substr_and_substring","title":"substr() and substring()","text":"substr() and substring() return a substring extracting count
characters starting from the specified position pos
of a specified string.
Syntax: substr(<string>,<pos>,<count>)
, substring(<string>,<pos>,<count>)
string
: A specified string.pos
: The position of starting extract (character index). Data type is int.count
: The number of characters extracted from the start position onwards.
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#the_return_of_substr_and_substring","title":"The return of substr() and substring()","text":"
is 0, it extracts from the specified string leading (including the first character).pos
is greater than the maximum string index, an empty string is returned.pos
is a negative number, BAD_DATA
is returned.count
is omitted, the function returns the substring starting at the position given by pos
and extending to the end of the string.count
is 0, an empty string is returned.NULL
as any of the argument of substr()
will cause an issue.OpenCypher compatibility
In openCypher, if a
is null
, null
is returned.
Example:
nebula> RETURN substr(\"abcdefg\",2,4);\n+-----------------------+\n| substr(\"abcdefg\",2,4) |\n+-----------------------+\n| \"cdef\" |\n+-----------------------+\nnebula> RETURN substr(\"abcdefg\",0,4);\n+-----------------------+\n| substr(\"abcdefg\",0,4) |\n+-----------------------+\n| \"abcd\" |\n+-----------------------+\nnebula> RETURN substr(\"abcdefg\",2);\n+---------------------+\n| substr(\"abcdefg\",2) |\n+---------------------+\n| \"cdefg\" |\n+---------------------+\n
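Per the notes above, a pos beyond the end of the string yields an empty string rather than an error:
nebula> RETURN substr(\"abcdefg\",10,2);\n+------------------------+\n| substr(\"abcdefg\",10,2) |\n+------------------------+\n| \"\" |\n+------------------------+\n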
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#reverse","title":"reverse()","text":"reverse() returns a string in reverse order.
Syntax: reverse(<string>)
string
: A specified string.Example:
nebula> RETURN reverse(\"abcdefg\");\n+--------------------+\n| reverse(\"abcdefg\") |\n+--------------------+\n| \"gfedcba\" |\n+--------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#replace","title":"replace()","text":"replace() replaces string a in a specified string with string b.
Syntax: replace(<string>,<substr_a>,<string_b>)
string
: A specified string.substr_a
: String a.string_b
: String b.Example:
nebula> RETURN replace(\"abcdefg\",\"cd\",\"AAAAA\");\n+---------------------------------+\n| replace(\"abcdefg\",\"cd\",\"AAAAA\") |\n+---------------------------------+\n| \"abAAAAAefg\" |\n+---------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#split","title":"split()","text":"split() splits a specified string at string b and returns a list of strings.
Syntax: split(<string>,<substr>)
string
: A specified string.substr
: The delimiter to split at.Example:
nebula> RETURN split(\"basketballplayer\",\"a\");\n+-------------------------------+\n| split(\"basketballplayer\",\"a\") |\n+-------------------------------+\n| [\"b\", \"sketb\", \"llpl\", \"yer\"] |\n+-------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#concat","title":"concat()","text":"concat() returns strings concatenated by all strings.
Syntax: concat(<string1>,<string2>,...)
If any one of the strings is NULL, NULL is returned.Example:
//This example concatenates 1, 2, and 3.\nnebula> RETURN concat(\"1\",\"2\",\"3\") AS r;\n+-------+\n| r |\n+-------+\n| \"123\" |\n+-------+\n\n//In this example, one of the string is NULL.\nnebula> RETURN concat(\"1\",\"2\",NULL) AS r;\n+----------+\n| r |\n+----------+\n| __NULL__ |\n+----------+\n\nnebula> GO FROM \"player100\" over follow \\\n YIELD concat(src(edge), properties($^).age, properties($$).name, properties(edge).degree) AS A;\n+------------------------------+\n| A |\n+------------------------------+\n| \"player10042Tony Parker95\" |\n| \"player10042Manu Ginobili95\" |\n+------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#concat_ws","title":"concat_ws()","text":"concat_ws() returns strings concatenated by all strings that are delimited with a separator.
Syntax: concat_ws(<separator>,<string1>,<string2>,... )
If the separator is NULL, the concat_ws() function returns NULL.
If the separator is not NULL and there is only one string, the string itself is returned.
If there is a NULL in the strings, NULL is ignored during the concatenation.Example:
//This example concatenates a, b, and c with the separator +.\nnebula> RETURN concat_ws(\"+\",\"a\",\"b\",\"c\") AS r;\n+---------+\n| r |\n+---------+\n| \"a+b+c\" |\n+---------+\n\n//In this example, the separator is NULL.\nneubla> RETURN concat_ws(NULL,\"a\",\"b\",\"c\") AS r;\n+----------+\n| r |\n+----------+\n| __NULL__ |\n+----------+\n\n//In this example, the separator is + and there is a NULL in the strings.\nnebula> RETURN concat_ws(\"+\",\"a\",NULL,\"b\",\"c\") AS r;\n+---------+\n| r |\n+---------+\n| \"a+b+c\" |\n+---------+\n\n//In this example, the separator is + and there is only one string.\nnebula> RETURN concat_ws(\"+\",\"a\") AS r;\n+-----+\n| r |\n+-----+\n| \"a\" |\n+-----+\nnebula> GO FROM \"player100\" over follow \\\n YIELD concat_ws(\" \",src(edge), properties($^).age, properties($$).name, properties(edge).degree) AS A;\n+---------------------------------+\n| A |\n+---------------------------------+\n| \"player100 42 Tony Parker 95\" |\n| \"player100 42 Manu Ginobili 95\" |\n+---------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#extract","title":"extract()","text":"extract() uses regular expression matching to retrieve a single substring or all substrings from a string.
Syntax: extract(<string>,\"<regular_expression>\")
string
: A specified stringregular_expression
: A regular expressionExample:
nebula> MATCH (a:player)-[b:serve]-(c:team{name: \"Lakers\"}) \\\n WHERE a.player.age > 45 \\\n RETURN extract(a.player.name, \"\\\\w+\") AS result;\n+----------------------------+\n| result |\n+----------------------------+\n| [\"Shaquille\", \"O\", \"Neal\"] |\n+----------------------------+\n\nnebula> MATCH (a:player)-[b:serve]-(c:team{name: \"Lakers\"}) \\\n WHERE a.player.age > 45 \\\n RETURN extract(a.player.name, \"hello\") AS result;\n+--------+\n| result |\n+--------+\n| [] |\n+--------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/2.string/#json_extract","title":"json_extract()","text":"json_extract() converts the specified JSON string to a map.
Syntax: json_extract(<string>)
string
: A specified string, which must be a JSON string.Caution
Example:
nebula> YIELD json_extract('{\"a\": 1, \"b\": {}, \"c\": {\"d\": true}}') AS result;\n+-----------------------------+\n| result |\n+-----------------------------+\n| {a: 1, b: {}, c: {d: true}} |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/3.date-and-time/","title":"Built-in date and time functions","text":"NebulaGraph supports the following built-in date and time functions:
Function Description int now() Returns the current timestamp of the system. timestamp timestamp() Returns the current timestamp of the system. date date() Returns the current UTC date based on the current system. time time() Returns the current UTC time based on the current system. datetime datetime() Returns the current UTC date and time based on the current system. map duration() Returns the period of time. It can be used to calculate the specified time.For more information, see Date and time types.
"},{"location":"3.ngql-guide/6.functions-and-expressions/3.date-and-time/#examples","title":"Examples","text":"nebula> RETURN now(), timestamp(), date(), time(), datetime();\n+------------+-------------+------------+-----------------+----------------------------+\n| now() | timestamp() | date() | time() | datetime() |\n+------------+-------------+------------+-----------------+----------------------------+\n| 1640057560 | 1640057560 | 2021-12-21 | 03:32:40.351000 | 2021-12-21T03:32:40.351000 |\n+------------+-------------+------------+-----------------+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/","title":"Schema-related functions","text":"This topic describes the schema-related functions supported by NebulaGraph. There are two types of schema-related functions, one for native nGQL statements and the other for openCypher-compatible statements.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#for_ngql_statements","title":"For nGQL statements","text":"The following functions are available in YIELD
and WHERE
clauses of nGQL statements.
Note
Since vertex, edge, vertices, edges, and path are keywords, you need to use AS <alias>
to set the alias, such as GO FROM \"player100\" OVER follow YIELD edge AS e;
.
id(vertex) returns the ID of a vertex.
Syntax: id(vertex)
Example:
nebula> LOOKUP ON player WHERE player.age > 45 YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player144\" |\n| \"player140\" |\n+-------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#propertiesvertex","title":"properties(vertex)","text":"properties(vertex) returns the properties of a vertex.
Syntax: properties(vertex)
Example:
nebula> LOOKUP ON player WHERE player.age > 45 \\\n YIELD properties(vertex);\n+-------------------------------------+\n| properties(VERTEX) |\n+-------------------------------------+\n| {age: 47, name: \"Shaquille O'Neal\"} |\n| {age: 46, name: \"Grant Hill\"} |\n+-------------------------------------+\n
You can also use the property reference symbols ($^
and $$
) instead of the vertex
field in the properties()
function to get all properties of a vertex.
$^
represents the data of the starting vertex at the beginning of exploration. For example, in GO FROM \"player100\" OVER follow reversely YIELD properties($^)
, $^
refers to the vertex player100
.$$
represents the data of the end vertex at the end of exploration.properties($^)
and properties($$)
are generally used in GO
statements. For more information, see Property reference.
Caution
You can use properties().<property_name>
to get a specific property of a vertex. However, it is not recommended to use this method to obtain specific properties because the properties()
function returns all properties, which can decrease query performance.
properties(edge) returns the properties of an edge.
Syntax: properties(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties(edge);\n+------------------+\n| properties(EDGE) |\n+------------------+\n| {degree: 95} |\n| {degree: 95} |\n+------------------+\n
Caution
You can use properties(edge).<property_name>
to get a specific property of an edge. However, it is not recommended to use this method to obtain specific properties because the properties(edge)
function returns all properties, which can decrease query performance.
type(edge) returns the edge type of an edge.
Syntax: type(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge), type(edge), rank(edge);\n+-------------+-------------+------------+------------+\n| src(EDGE) | dst(EDGE) | type(EDGE) | rank(EDGE) |\n+-------------+-------------+------------+------------+\n| \"player100\" | \"player101\" | \"follow\" | 0 |\n| \"player100\" | \"player125\" | \"follow\" | 0 |\n+-------------+-------------+------------+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#srcedge","title":"src(edge)","text":"src(edge) returns the source vertex ID of an edge.
Syntax: src(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge);\n+-------------+-------------+\n| src(EDGE) | dst(EDGE) |\n+-------------+-------------+\n| \"player100\" | \"player101\" |\n| \"player100\" | \"player125\" |\n+-------------+-------------+\n
Note
The semantics of the query for the starting vertex with src(edge) and properties($^
) are different. src(edge) indicates the starting vertex ID of the edge in the graph database, while properties($^
) indicates the data of the starting vertex where you start to expand the graph, such as the data of the starting vertex player100
in the above GO statement.
dst(edge) returns the destination vertex ID of an edge.
Syntax: dst(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge);\n+-------------+-------------+\n| src(EDGE) | dst(EDGE) |\n+-------------+-------------+\n| \"player100\" | \"player101\" |\n| \"player100\" | \"player125\" |\n+-------------+-------------+\n
Note
dst(edge) indicates the destination vertex ID of the edge in the graph database.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#rankedge","title":"rank(edge)","text":"rank(edge) returns the rank value of an edge.
Syntax: rank(edge)
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge), dst(edge), rank(edge);\n+-------------+-------------+------------+\n| src(EDGE) | dst(EDGE) | rank(EDGE) |\n+-------------+-------------+------------+\n| \"player100\" | \"player101\" | 0 |\n| \"player100\" | \"player125\" | 0 |\n+-------------+-------------+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#vertex","title":"vertex","text":"vertex returns the information of vertices, including VIDs, tags, properties, and values. You need to use AS <alias>
to set the alias.
Syntax: vertex
Example:
nebula> LOOKUP ON player WHERE player.age > 45 YIELD vertex AS v;\n+----------------------------------------------------------+\n| v |\n+----------------------------------------------------------+\n| (\"player144\" :player{age: 47, name: \"Shaquille O'Neal\"}) |\n| (\"player140\" :player{age: 46, name: \"Grant Hill\"}) |\n+----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#edge","title":"edge","text":"edge returns the information of edges, including edge types, source vertices, destination vertices, ranks, properties, and values. You need to use AS <alias>
to set the alias.
Syntax: edge
Example:
nebula> GO FROM \"player100\" OVER follow YIELD edge AS e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#vertices","title":"vertices","text":"vertices returns the information of vertices in a subgraph. For more information, see GET SUBGRAPH.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#edges","title":"edges","text":"edges returns the information of edges in a subgraph. For more information, see GET SUBGRAPH.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#path","title":"path","text":"path returns the information of a path. For more information, see FIND PATH.
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#for_statements_compatible_with_opencypher","title":"For statements compatible with openCypher","text":"The following functions are available in RETURN
and WHERE
clauses of openCypher-compatible statements.
id() returns the ID of a vertex.
Syntax: id(<vertex>)
Example:
nebula> MATCH (v:player) RETURN id(v); \n+-------------+\n| id(v) |\n+-------------+\n| \"player129\" |\n| \"player115\" |\n| \"player106\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#tags_and_labels","title":"tags() and labels()","text":"tags() and labels() return the Tag of a vertex.
Syntax: tags(<vertex>)
, labels(<vertex>)
Example:
nebula> MATCH (v) WHERE id(v) == \"player100\" \\\n RETURN tags(v);\n+------------+\n| tags(v) |\n+------------+\n| [\"player\"] |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#properties","title":"properties()","text":"properties() returns the properties of a vertex or an edge.
Syntax: properties(<vertex_or_edge>)
Example:
nebula> MATCH (v:player)-[e:follow]-() RETURN properties(v),properties(e);\n+---------------------------------------+---------------+\n| properties(v) | properties(e) |\n+---------------------------------------+---------------+\n| {age: 31, name: \"Stephen Curry\"} | {degree: 90} |\n| {age: 47, name: \"Shaquille O'Neal\"} | {degree: 100} |\n| {age: 34, name: \"LeBron James\"} | {degree: 13} |\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#type","title":"type()","text":"type() returns the edge type of an edge.
Syntax: type(<edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN type(e);\n+----------+\n| type(e) |\n+----------+\n| \"serve\" |\n| \"follow\" |\n| \"follow\" |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#typeid","title":"typeid()","text":"typeid() returns the internal ID value of the Edge type of the edge, which can be used to determine the direction by positive or negative.
Syntax: typeid(<edge>)
Example:
nebula> MATCH (v:player)-[e:follow]-(v2) RETURN e,typeid(e), \\\n CASE WHEN typeid(e) > 0 \\\n THEN \"Forward\" ELSE \"Reverse\" END AS direction \\\n LIMIT 5;\n+----------------------------------------------------+-----------+-----------+\n| e | typeid(e) | direction |\n+----------------------------------------------------+-----------+-----------+\n| [:follow \"player127\"->\"player114\" @0 {degree: 90}] | 5 | \"Forward\" |\n| [:follow \"player127\"->\"player148\" @0 {degree: 70}] | 5 | \"Forward\" |\n| [:follow \"player148\"->\"player127\" @0 {degree: 80}] | -5 | \"Reverse\" |\n| [:follow \"player147\"->\"player136\" @0 {degree: 90}] | 5 | \"Forward\" |\n| [:follow \"player136\"->\"player147\" @0 {degree: 90}] | -5 | \"Reverse\" |\n+----------------------------------------------------+-----------+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#src","title":"src()","text":"src() returns the source vertex ID of an edge.
Syntax: src(<edge>)
Example:
nebula> MATCH ()-[e]->(v:player{name:\"Tim Duncan\"}) \\\n RETURN src(e);\n+-------------+\n| src(e) |\n+-------------+\n| \"player125\" |\n| \"player113\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#dst","title":"dst()","text":"dst() returns the destination vertex ID of an edge.
Syntax: dst(<edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN dst(e);\n+-------------+\n| dst(e) |\n+-------------+\n| \"team204\" |\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#startnode","title":"startNode()","text":"startNode() visits a path and returns its information of source vertex ID, including VIDs, tags, properties, and values.
Syntax: startNode(<path>)
Example:
nebula> MATCH p = (a :player {name : \"Tim Duncan\"})-[r:serve]-(t) \\\n RETURN startNode(p);\n+----------------------------------------------------+\n| startNode(p) |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#endnode","title":"endNode()","text":"endNode() visits a path and returns its information of destination vertex ID, including VIDs, tags, properties, and values.
Syntax: endNode(<path>)
Example:
nebula> MATCH p = (a :player {name : \"Tim Duncan\"})-[r:serve]-(t) \\\n RETURN endNode(p);\n+----------------------------------+\n| endNode(p) |\n+----------------------------------+\n| (\"team204\" :team{name: \"Spurs\"}) |\n+----------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/4.schema/#rank","title":"rank()","text":"rank() returns the rank value of an edge.
Syntax: rank(<edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN rank(e);\n+---------+\n| rank(e) |\n+---------+\n| 0 |\n| 0 |\n| 0 |\n+---------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/","title":"Conditional expressions","text":"This topic describes the conditional functions supported by NebulaGraph.
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/#case","title":"CASE","text":"The CASE
expression uses conditions to filter the parameters. nGQL provides two forms of CASE
expressions just like openCypher: the simple form and the generic form.
The CASE
expression will traverse all the conditions. When the first condition is met, the CASE
expression stops reading the conditions and returns the result. If no conditions are met, it returns the result in the ELSE
clause. If there is no ELSE
clause and no conditions are met, it returns NULL
.
CASE <comparer>\nWHEN <value> THEN <result>\n[WHEN ...]\n[ELSE <default>]\nEND\n
Caution
Always remember to end the CASE
expression with an END
.
comparer
A value or a valid expression that outputs a value. This value is used to compare with the value
. value
It will be compared with the comparer
. If the value
matches the comparer
, then this condition is met. result
The result
is returned by the CASE
expression if the value
matches the comparer
. default
The default
is returned by the CASE
expression if no conditions are met. nebula> RETURN \\\n CASE 2+3 \\\n WHEN 4 THEN 0 \\\n WHEN 5 THEN 1 \\\n ELSE -1 \\\n END \\\n AS result;\n+--------+\n| result |\n+--------+\n| 1 |\n+--------+\n
nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties($$).name AS Name, \\\n CASE properties($$).age > 35 \\\n WHEN true THEN \"Yes\" \\\n WHEN false THEN \"No\" \\\n ELSE \"Nah\" \\\n END \\\n AS Age_above_35;\n+-----------------+--------------+\n| Name | Age_above_35 |\n+-----------------+--------------+\n| \"Tony Parker\" | \"Yes\" |\n| \"Manu Ginobili\" | \"Yes\" |\n+-----------------+--------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/#the_generic_form_of_case_expressions","title":"The generic form of CASE expressions","text":"CASE\nWHEN <condition> THEN <result>\n[WHEN ...]\n[ELSE <default>]\nEND\n
Parameter Description condition
If the condition
is evaluated as true, the result
is returned by the CASE
expression. result
The result
is returned by the CASE
expression if the condition
is evaluated as true. default
The default
is returned by the CASE
expression if no conditions are met. nebula> YIELD \\\n CASE WHEN 4 > 5 THEN 0 \\\n WHEN 3+4==7 THEN 1 \\\n ELSE 2 \\\n END \\\n AS result;\n+--------+\n| result |\n+--------+\n| 1 |\n+--------+\n
nebula> MATCH (v:player) WHERE v.player.age > 30 \\\n RETURN v.player.name AS Name, \\\n CASE \\\n WHEN v.player.name STARTS WITH \"T\" THEN \"Yes\" \\\n ELSE \"No\" \\\n END \\\n AS Starts_with_T;\n+---------------------+---------------+\n| Name | Starts_with_T |\n+---------------------+---------------+\n| \"Tim Duncan\" | \"Yes\" |\n| \"LaMarcus Aldridge\" | \"No\" |\n| \"Tony Parker\" | \"Yes\" |\n+---------------------+---------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/5.conditional-expressions/#differences_between_the_simple_form_and_the_generic_form","title":"Differences between the simple form and the generic form","text":"To avoid the misuse of the simple form and the generic form, it is important to understand their differences. The following example can help explain them.
nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties($$).name AS Name, properties($$).age AS Age, \\\n CASE properties($$).age \\\n WHEN properties($$).age > 35 THEN \"Yes\" \\\n ELSE \"No\" \\\n END \\\n AS Age_above_35;\n+-----------------+-----+--------------+\n| Name | Age | Age_above_35 |\n+-----------------+-----+--------------+\n| \"Tony Parker\" | 36 | \"No\" |\n| \"Manu Ginobili\" | 41 | \"No\" |\n+-----------------+-----+--------------+\n
The preceding GO
query is intended to output Yes
when the player's age is above 35. However, in this example, when the player's age is 36, the actual output is not as expected: It is No
instead of Yes
.
This is because the query uses the CASE
expression in the simple form, and a comparison between the values of $$.player.age
and $$.player.age > 35
is made. When the player age is 36:
$$.player.age
is 36
. It is an integer.$$.player.age > 35
is evaluated to be true
. It is a boolean.The values of $$.player.age
and $$.player.age > 35
do not match. Therefore, the condition is not met and No
is returned.
coalesce() returns the first not null value in all expressions.
Syntax: coalesce(<expression_1>[,<expression_2>...])
Example:
nebula> RETURN coalesce(null,[1,2,3]) as result;\n+-----------+\n| result |\n+-----------+\n| [1, 2, 3] |\n+-----------+\nnebula> RETURN coalesce(null) as result;\n+----------+\n| result |\n+----------+\n| __NULL__ |\n+----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/","title":"List functions","text":"This topic describes the list functions supported by NebulaGraph. Some of the functions have different syntax in native nGQL statements and openCypher-compatible statements.
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#precautions","title":"Precautions","text":"Like SQL, the position index in nGQL starts from 1
, while in the C language it starts from 0
.
range() returns the list containing all the fixed-length steps in [start,end]
.
Syntax: range(start, end [, step])
step
: Optional parameters. step
is 1 by default.Example:
nebula> RETURN range(1,9,2);\n+-----------------+\n| range(1,9,2) |\n+-----------------+\n| [1, 3, 5, 7, 9] |\n+-----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#reverse","title":"reverse()","text":"reverse() returns the list reversing the order of all elements in the original list.
Syntax: reverse(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN reverse(ids);\n+-----------------------------------+\n| reverse(ids) |\n+-----------------------------------+\n| [487, 521, \"abc\", 4923, __NULL__] |\n+-----------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#tail","title":"tail()","text":"tail() returns all the elements of the original list, excluding the first one.
Syntax: tail(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN tail(ids);\n+-------------------------+\n| tail(ids) |\n+-------------------------+\n| [4923, \"abc\", 521, 487] |\n+-------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#head","title":"head()","text":"head() returns the first element of a list.
Syntax: head(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN head(ids);\n+-----------+\n| head(ids) |\n+-----------+\n| __NULL__ |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#last","title":"last()","text":"last() returns the last element of a list.
Syntax: last(<list>)
Example:
nebula> WITH [NULL, 4923, 'abc', 521, 487] AS ids \\\n RETURN last(ids);\n+-----------+\n| last(ids) |\n+-----------+\n| 487 |\n+-----------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#reduce","title":"reduce()","text":"reduce() applies an expression to each element in a list one by one, chains the result to the next iteration by taking it as the initial value, and returns the final result. This function iterates each element e
in the given list, runs the expression on e
, accumulates the result with the initial value, and store the new result in the accumulator as the initial value of the next iteration. It works like the fold or reduce method in functional languages such as Lisp and Scala.
openCypher compatibility
In openCypher, the reduce()
function is not defined. nGQL will implement the reduce()
function in the Cypher way.
Syntax: reduce(<accumulator> = <initial>, <variable> IN <list> | <expression>)
accumulator
: A variable that will hold the accumulated results as the list is iterated.initial
: An expression that runs once to give an initial value to the accumulator
.variable
: A variable in the list that will be applied to the expression successively.list
: A list or a list of expressions.expression
: This expression will be run on each element in the list once and store the result value in the accumulator
.Example:
nebula> RETURN reduce(totalNum = -4 * 5, n IN [1, 2] | totalNum + n * 2) AS r;\n+-----+\n| r |\n+-----+\n| -14 |\n+-----+\n\nnebula> MATCH p = (n:player{name:\"LeBron James\"})<-[:follow]-(m) \\\n RETURN nodes(p)[0].player.age AS src1, nodes(p)[1].player.age AS dst2, \\\n reduce(totalAge = 100, n IN nodes(p) | totalAge + n.player.age) AS sum;\n+------+------+-----+\n| src1 | dst2 | sum |\n+------+------+-----+\n| 34 | 31 | 165 |\n| 34 | 29 | 163 |\n| 34 | 33 | 167 |\n| 34 | 26 | 160 |\n| 34 | 34 | 168 |\n| 34 | 37 | 171 |\n+------+------+-----+\n\nnebula> LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD id(vertex) AS VertexID \\\n | GO FROM $-.VertexID over follow \\\n WHERE properties(edge).degree != reduce(totalNum = 5, n IN range(1, 3) | properties($$).age + totalNum + n) \\\n YIELD properties($$).name AS id, properties($$).age AS age, properties(edge).degree AS degree;\n+---------------------+-----+--------+\n| id | age | degree |\n+---------------------+-----+--------+\n| \"Tim Duncan\" | 42 | 95 |\n| \"LaMarcus Aldridge\" | 33 | 90 |\n| \"Manu Ginobili\" | 41 | 95 |\n+---------------------+-----+--------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#for_ngql_statements","title":"For nGQL statements","text":""},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#keys","title":"keys()","text":"keys() returns a list containing the string representations for all the property names of vertices or edges.
Syntax: keys({vertex | edge})
Example:
nebula> LOOKUP ON player \\\n WHERE player.age > 45 \\\n YIELD keys(vertex);\n+-----------------+\n| keys(VERTEX) |\n+-----------------+\n| [\"age\", \"name\"] |\n| [\"age\", \"name\"] |\n+-----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#labels","title":"labels()","text":"labels() returns the list containing all the tags of a vertex.
Syntax: labels(verte)
Example:
nebula> FETCH PROP ON * \"player101\", \"player102\", \"team204\" \\\n YIELD labels(vertex);\n+----------------+\n| labels(VERTEX) |\n+----------------+\n| [\"player\"] |\n| [\"player\"] |\n| [\"team\"] |\n+----------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#for_statements_compatible_with_opencypher","title":"For statements compatible with openCypher","text":""},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#keys_1","title":"keys()","text":"keys() returns a list containing the string representations for all the property names of vertices, edges, or maps.
Syntax: keys(<vertex_or_edge>)
Example:
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN keys(e);\n+----------------------------+\n| keys(e) |\n+----------------------------+\n| [\"end_year\", \"start_year\"] |\n| [\"degree\"] |\n| [\"degree\"] |\n+----------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#labels_1","title":"labels()","text":"labels() returns the list containing all the tags of a vertex.
Syntax: labels(<vertex>)
Example:
nebula> MATCH (v)-[e:serve]->() \\\n WHERE id(v)==\"player100\" \\\n RETURN labels(v);\n+------------+\n| labels(v) |\n+------------+\n| [\"player\"] |\n+------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#nodes","title":"nodes()","text":"nodes() returns the list containing all the vertices in a path.
Syntax: nodes(<path>)
Example:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) \\\n RETURN nodes(p);\n+-------------------------------------------------------------------------------------------------------------+\n| nodes(p) |\n+-------------------------------------------------------------------------------------------------------------+\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"team204\" :team{name: \"Spurs\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player101\" :player{age: 36, name: \"Tony Parker\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player125\" :player{age: 41, name: \"Manu Ginobili\"})] |\n+-------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/6.list/#relationships","title":"relationships()","text":"relationships() returns the list containing all the relationships in a path.
Syntax: relationships(<path>)
Example:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) \\\n RETURN relationships(p);\n+-------------------------------------------------------------------------+\n| relationships(p) |\n+-------------------------------------------------------------------------+\n| [[:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}]] |\n| [[:follow \"player100\"->\"player101\" @0 {degree: 95}]] |\n| [[:follow \"player100\"->\"player125\" @0 {degree: 95}]] |\n+-------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/8.predicate/","title":"Predicate functions","text":"Predicate functions return true
or false
. They are most commonly used in WHERE
clauses.
NebulaGraph supports the following predicate functions:
Functions Description exists() Returnstrue
if the specified property exists in the vertex, edge or map. Otherwise, returns false
. any() Returns true
if the specified predicate holds for at least one element in the given list. Otherwise, returns false
. all() Returns true
if the specified predicate holds for all elements in the given list. Otherwise, returns false
. none() Returns true
if the specified predicate holds for no element in the given list. Otherwise, returns false
. single() Returns true
if the specified predicate holds for exactly one of the elements in the given list. Otherwise, returns false
. Note
NULL is returned if the list is NULL or all of its elements are NULL.
Compatibility
In openCypher, only function exists()
is defined and specified. The other functions are implement-dependent.
<predicate>(<variable> IN <list> WHERE <condition>)\n
"},{"location":"3.ngql-guide/6.functions-and-expressions/8.predicate/#examples","title":"Examples","text":"nebula> RETURN any(n IN [1, 2, 3, 4, 5, NULL] \\\n WHERE n > 2) AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> RETURN single(n IN range(1, 5) \\\n WHERE n == 3) AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> RETURN none(n IN range(1, 3) \\\n WHERE n == 0) AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> WITH [1, 2, 3, 4, 5, NULL] AS a \\\n RETURN any(n IN a WHERE n > 2);\n+-------------------------+\n| any(n IN a WHERE (n>2)) |\n+-------------------------+\n| true |\n+-------------------------+\n\nnebula> MATCH p = (n:player{name:\"LeBron James\"})<-[:follow]-(m) \\\n RETURN nodes(p)[0].player.name AS n1, nodes(p)[1].player.name AS n2, \\\n all(n IN nodes(p) WHERE n.player.name NOT STARTS WITH \"D\") AS b;\n+----------------+-------------------+-------+\n| n1 | n2 | b |\n+----------------+-------------------+-------+\n| \"LeBron James\" | \"Danny Green\" | false |\n| \"LeBron James\" | \"Dejounte Murray\" | false |\n| \"LeBron James\" | \"Chris Paul\" | true |\n| \"LeBron James\" | \"Kyrie Irving\" | true |\n| \"LeBron James\" | \"Carmelo Anthony\" | true |\n| \"LeBron James\" | \"Dwyane Wade\" | false |\n+----------------+-------------------+-------+\n\nnebula> MATCH p = (n:player{name:\"LeBron James\"})-[:follow]->(m) \\\n RETURN single(n IN nodes(p) WHERE n.player.age > 40) AS b;\n+------+\n| b |\n+------+\n| true |\n+------+\n\nnebula> MATCH (n:player) \\\n RETURN exists(n.player.id), n IS NOT NULL;\n+---------------------+---------------+\n| exists(n.player.id) | n IS NOT NULL |\n+---------------------+---------------+\n| false | true |\n...\n\nnebula> MATCH (n:player) \\\n WHERE exists(n['name']) RETURN n;\n+-------------------------------------------------------------------------------------------------------------+\n| n |\n+-------------------------------------------------------------------------------------------------------------+\n| (\"Grant Hill\" :player{age: 46, name: \"Grant Hill\"}) |\n| (\"Marc Gasol\" :player{age: 34, name: \"Marc Gasol\"}) |\n+-------------------------------------------------------------------------------------------------------------+\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/","title":"Overview of NebulaGraph general query statements","text":"This topic provides an overview of the general categories of query statements in NebulaGraph and outlines their use cases.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#background","title":"Background","text":"NebulaGraph stores data in the form of vertices and edges. Each vertex can have zero or more tags and each edge has exactly one edge type. Tags define the type of a vertex and describe its properties, while edge types define the type of an edge and describe its properties. When querying, you can limit the scope of the query by specifying the tag of a vertex or the type of an edge. For more information, see Patterns.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#categories","title":"Categories","text":"The primary query statements in NebulaGraph fall into the following categories:
FETCH PROP ON
and LOOKUP ON
statements are primarily for basic data queries, GO
and MATCH
for more intricate queries and graph traversals, FIND PATH
and GET SUBGRAPH
for path and subgraph queries, and SHOW
for retrieving database metadata.
Usage: Retrieve properties of a specified vertex or edge.
Use case: Knowing the specific vertex or edge ID and wanting to retrieve its properties.
Note:
YIELD
clause to specify the returned properties.Example:
FETCH PROP ON player \"player100\" YIELD properties(vertex);\n --+--- ----+----- -------+----------\n | | |\n | | |\n | | +--------- Returns all properties under the player tag of the vertex.\n | |\n | +----------------- Retrieves from the vertex \"player100\".\n |\n +--------------------------- Retrieves properties under the player tag.\n
For more information, see FETCH PROP ON.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#lookup_on","title":"LOOKUP ON","text":"Usage: Index-based querying of vertex or edge IDs.
Use case: Finding vertex or edge IDs based on property values.
Note: - Must pre-define indexes for the tag, edge type, or property. - Must specify the tag of the vertex or the edge type of the edge. - Must use the YIELD
clause to specify the returned IDs.
Example:
LOOKUP ON player WHERE player.name == \"Tony Parker\" YIELD id(vertex);\n --+--- ------------------+--------------- ---+------\n | | |\n | | |\n | | +---- Returns the VID of the retrieved vertex.\n | |\n | +------------ Filtering is based on the value of the property name.\n |\n +----------------------------------- Queries based on the player tag.\n
For more information, see LOOKUP ON.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#go","title":"GO","text":"Usage: Traverse the graph based on a given vertex and return information about the starting vertex, edges, or target vertices as needed. Use case: Complex graph traversals, such as finding friends of a vertex, friends' friends, etc.
Note: - Use property reference symbols ($^
and $$
) to return properties of the starting or target vertices, e.g., YIELD $^.player.name
. - Use the functions properties($^)
and properties($$)
to return all properties of the starting or target vertices. Specify property names in the function to return specific properties, e.g., YIELD properties($^).name
. - Use the functions src(edge)
and dst(edge)
to return the starting or destination vertex ID of an edge, e.g., YIELD src(edge)
.
Example:
GO 3 STEPS FROM \"player102\" OVER follow YIELD dst(edge);\n-----+--- --+------- -+---- ---+-----\n | | | |\n | | | |\n | | | +--------- Returns the destination vertex of the last hop.\n | | |\n | | +------ Traverses out via the edge follow.\n | |\n | +--------------------- Starts from \"player102\".\n |\n +---------------------------------- Traverses 3 steps.\n
For more information, see GO.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#match","title":"MATCH","text":"Usage: Execute complex graph pattern matching queries.
Use case: Complex graph pattern matching, such as finding combinations of vertices and edges that satisfy a specific pattern.
Note:
MATCH
statements are compatible with the OpenCypher syntax but with some differences:
==
for equality instead of =
, e.g., WHERE player.name == \"Tony Parker\"
.YIELD player.name
.WHERE id(v) == \"player100\"
syntax.RETURN
clause to specify what information to return.Example:
MATCH (v:player{name:\"Tim Duncan\"})-->(v2:player) \\\n RETURN v2.player.name AS Name;\n
For more information, see MATCH.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#find_path","title":"FIND PATH","text":"Usage: Query paths between given starting and target vertices or query properties of vertices and edges along paths.
Use case: Querying paths between two vertices.
Note: Must use the YIELD
clause to specify returned information.
Example:
FIND SHORTEST PATH FROM \"player102\" TO \"team204\" OVER * YIELD path AS p;\n-------+----- -------+---------------- ---+-- ----+----\n | | | |\n | | | |\n | | | +---------- Returns the path as 'p'.\n | | |\n | | +----------- Travels outwards via all types of edges.\n | | \n | |\n | +------------------ From the given starting and target VIDs. \n |\n +--------------------------- Retrieves the shortest path.\n
For more information, see FIND PATH.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#get_subgraph","title":"GET SUBGRAPH","text":"Usage: Extract a portion of the graph that satisfies specific conditions or query properties of vertices and edges in the subgraph.
Use case: Analyzing structures of the graph or specific regions, such as extracting the social network subgraph of a person or the transportation network subgraph of an area.
Note: Must use the YIELD
clause to specify returned information.
Example:
GET SUBGRAPH 5 STEPS FROM \"player101\" YIELD VERTICES AS nodes, EDGES AS relationships;\n -----+- -----+-------- ------------------------+----------------\n | | |\n | | |\n | +------- Starts from \"player101\". +------------ Returns all vertices and edges.\n |\n +----------------- Gets exploration of 5 steps \n
For more information, see GET SUBGRAPH.
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#show","title":"SHOW","text":"SHOW
statements are mainly used to obtain metadata information from the database, not for retrieving the actual data stored in the database. These statements are typically used to query the structure and configuration of the database.
SHOW CHARSET
SHOW CHARSET
Shows the available character sets. SHOW COLLATION SHOW COLLATION
SHOW COLLATION
Shows the collations supported by NebulaGraph. SHOW CREATE SPACE SHOW CREATE SPACE <space_name>
SHOW CREATE SPACE basketballplayer
Shows the creating statement of the specified graph space. SHOW CREATE TAG/EDGE SHOW CREATE {TAG <tag_name> | EDGE <edge_name>}
SHOW CREATE TAG player
Shows the basic information of the specified tag. SHOW HOSTS SHOW HOSTS [GRAPH | STORAGE | META]
SHOW HOSTS
SHOW HOSTS GRAPH
Shows the host and version information of Graph Service, Storage Service, and Meta Service. SHOW INDEX STATUS SHOW {TAG | EDGE} INDEX STATUS
SHOW TAG INDEX STATUS
Shows the status of jobs that rebuild native indexes, which helps check whether a native index is successfully rebuilt or not. SHOW INDEXES SHOW {TAG | EDGE} INDEXES
SHOW TAG INDEXES
Shows the names of existing native indexes. SHOW PARTS SHOW PARTS [<part_id>]
SHOW PARTS
Shows the information of a specified partition or all partitions in a graph space. SHOW ROLES SHOW ROLES IN <space_name>
SHOW ROLES in basketballplayer
Shows the roles that are assigned to a user account. SHOW SNAPSHOTS SHOW SNAPSHOTS
SHOW SNAPSHOTS
Shows the information of all the snapshots. SHOW SPACES SHOW SPACES
SHOW SPACES
Shows existing graph spaces in NebulaGraph. SHOW STATS SHOW STATS
SHOW STATS
Shows the statistics of the graph space collected by the latest STATS
job. SHOW TAGS/EDGES SHOW TAGS | EDGES
SHOW TAGS
,SHOW EDGES
Shows all the tags in the current graph space. SHOW USERS SHOW USERS
SHOW USERS
Shows the user information. SHOW SESSIONS SHOW SESSIONS
SHOW SESSIONS
Shows the information of all the sessions. SHOW SESSIONS SHOW SESSION <Session_Id>
SHOW SESSION 1623304491050858
Shows a specified session with its ID. SHOW QUERIES SHOW [ALL] QUERIES
SHOW QUERIES
Shows the information of working queries in the current session. SHOW META LEADER SHOW META LEADER
SHOW META LEADER
Shows the information of the leader in the current Meta cluster."},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#compound_queries","title":"Compound queries","text":"Query statements in NebulaGraph can be combined to achieve more complex queries.
When referencing the results of a subquery in a compound statement, you need to create an alias for the result and use the pipe symbol(|
) to pass it to the next subquery. Use $-
in the next subquery to reference the alias of that result. See Pipe Symbol for details.
Example:
nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS dstid, properties($$).name AS Name | \\\n GO FROM $-.dstid OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n| \"player100\" |\n+-------------+\n
The pipe symbol |
is applicable only in nGQL and cannot be used in OpenCypher statements. If you need to perform compound queries using MATCH
statements, you can use the WITH clause.
Example:
nebula> MATCH (v:player)-->(v2:player) \\\n WITH DISTINCT v2 AS v2, v2.player.age AS Age \\\n ORDER BY Age \\\n WHERE Age<25 \\\n RETURN v2.player.name AS Name, Age;\n+----------------------+-----+\n| Name | Age |\n+----------------------+-----+\n| \"Luka Doncic\" | 20 |\n| \"Ben Simmons\" | 22 |\n| \"Kristaps Porzingis\" | 23 |\n+----------------------+-----+\n
"},{"location":"3.ngql-guide/7.general-query-statements/1.general-query-statements-overview/#more_information","title":"More information","text":"nGQL command cheatsheet
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/","title":"MATCH","text":"The MATCH
statement provides pattern-based search functionality, allowing you to retrieve data that matches one or more patterns in NebulaGraph. By defining one or more patterns, you can search for data that matches the patterns in NebulaGraph. Once the matching data is retrieved, you can use the RETURN
clause to return it as a result.
The examples in this topic use the basketballplayer dataset as the sample dataset.
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#syntax","title":"Syntax","text":"The syntax of MATCH
is relatively more flexible compared with that of other query statements such as GO
or LOOKUP
. The path type of the MATCH
statement is trail
. That is, only vertices can be repeatedly visited in the graph traversal. Edges cannot be repeatedly visited. For details, see path. But generally, it can be summarized as follows.
MATCH <pattern> [<clause_1>] RETURN <output> [<clause_2>];\n
pattern
: The MATCH
statement supports matching one or multiple patterns. Multiple patterns are separated by commas (,). For example: (a)-[]->(b),(c)-[]->(d)
. For the detailed description of patterns, see Patterns. clause_1
: The WHERE
, WITH
, UNWIND
, and OPTIONAL MATCH
clauses are supported, and the MATCH
clause can also be used.output
: Define the list name for the output results to be returned. You can use AS
to set an alias for the list.clause_2
: The ORDER BY
and LIMIT
clauses are supported.Legacy version compatibility
MATCH
statement supports full table scans. It can traverse vertices or edges in the graph without using any indexes or filter conditions. In previous versions, the MATCH
statement required an index for certain queries or needed to use LIMIT
to restrict the number of output results.RETURN <variable_name>.<property_name>
is changed to RETURN <variable_name>.<tag_name>.<property_name>
.v:player
and v.player.name
in the statement MATCH (v:player) RETURN v.player.name AS Name
.player
tag or the name property of the player
tag. For more information about the usage and considerations for indexes, see Must-read for using indexes.MATCH
statement cannot query dangling edges.You can use a user-defined variable in a pair of parentheses to represent a vertex in a pattern. For example: (v)
.
nebula> MATCH (v) \\\n RETURN v \\\n LIMIT 3;\n+-----------------------------------------------------------+\n| v |\n+-----------------------------------------------------------+\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n| (\"player106\" :player{age: 25, name: \"Kyle Anderson\"}) |\n| (\"player115\" :player{age: 40, name: \"Kobe Bryant\"}) |\n+-----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_tags","title":"Match tags","text":"Legacy version compatibility
LIMIT
to restrict the number of output results.MATCH
statement supports full table scans. There is no need to create an index for a tag or a specific property of a tag, nor use LIMIT
to restrict the number of output results in order to execute the MATCH
statement.You can specify a tag with :<tag_name>
after the vertex in a pattern.
nebula> MATCH (v:player) \\\n RETURN v;\n+---------------------------------------------------------------+\n| v |\n+---------------------------------------------------------------+\n| (\"player105\" :player{age: 31, name: \"Danny Green\"}) |\n| (\"player109\" :player{age: 34, name: \"Tiago Splitter\"}) |\n| (\"player111\" :player{age: 38, name: \"David West\"}) |\n...\n
To match vertices with multiple tags, use colons (:).
nebula> CREATE TAG actor (name string, age int);\nnebula> INSERT VERTEX actor(name, age) VALUES \"player100\":(\"Tim Duncan\", 42);\nnebula> MATCH (v:player:actor) \\\n         RETURN v;\n+----------------------------------------------------------------------------------------+\n| v                                                                                        |\n+----------------------------------------------------------------------------------------+\n| (\"player100\" :actor{age: 42, name: \"Tim Duncan\"} :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_vertex_properties","title":"Match vertex properties","text":"Note
The prerequisite for matching a vertex property is that the tag itself has an index of the corresponding property. Otherwise, you cannot execute the MATCH
statement to match the property.
You can specify a vertex property with {<prop_name>: <prop_value>}
after the tag in a pattern.
# The following example uses the name property to match a vertex.\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
The WHERE
clause can do the same thing:
nebula> MATCH (v:player) \\\n WHERE v.player.name == \"Tim Duncan\" \\\n RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
OpenCypher compatibility
In openCypher 9, =
is the equality operator. However, in nGQL, ==
is the equality operator and =
is the assignment operator (as in C++ or Java).
Use the WHERE
clause to directly get all the vertices with the vertex property value Tim Duncan.
nebula> MATCH (v) \\\n WITH v, properties(v) as props, keys(properties(v)) as kk \\\n WHERE [i in kk where props[i] == \"Tim Duncan\"] \\\n RETURN v;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n\nnebula> WITH ['Tim Duncan', 'Yao Ming'] AS names \\\n MATCH (v1:player)-->(v2:player) \\\n WHERE v1.player.name in names \\\n return v1, v2;\n+----------------------------------------------------+----------------------------------------------------------+\n| v1 | v2 |\n+----------------------------------------------------+----------------------------------------------------------+\n| (\"player133\" :player{age: 38, name: \"Yao Ming\"}) | (\"player114\" :player{age: 39, name: \"Tracy McGrady\"}) |\n| (\"player133\" :player{age: 38, name: \"Yao Ming\"}) | (\"player144\" :player{age: 47, name: \"Shaquille O'Neal\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n+----------------------------------------------------+----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_vids","title":"Match VIDs","text":"You can use the VID to match a vertex. The id()
function can retrieve the VID of a vertex.
nebula> MATCH (v) \\\n WHERE id(v) == 'player101' \\\n RETURN v;\n+-----------------------------------------------------+\n| v |\n+-----------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n+-----------------------------------------------------+\n
To match multiple VIDs, use WHERE id(v) IN [vid_list]
or WHERE id(v) IN {vid_list}
.
nebula> MATCH (v:player { name: 'Tim Duncan' })--(v2) \\\n WHERE id(v2) IN [\"player101\", \"player102\"] \\\n RETURN v2;\n+-----------------------------------------------------------+\n| v2 |\n+-----------------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n+-----------------------------------------------------------+\n\nnebula> MATCH (v) WHERE id(v) IN {\"player100\", \"player101\"} \\\n RETURN v.player.name AS name;\n+---------------+\n| name |\n+---------------+\n| \"Tony Parker\" |\n| \"Tim Duncan\" |\n+---------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_connected_vertices","title":"Match connected vertices","text":"You can use the --
symbol to represent an edge in either direction and match the vertices connected by such edges.
Legacy version compatibility
In nGQL 1.x, the --
symbol is used for inline comments. Starting from nGQL 2.x, the --
symbol represents an incoming or outgoing edge.
nebula> MATCH (v:player{name:\"Tim Duncan\"})--(v2) \\\n RETURN v2.player.name AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Manu Ginobili\" |\n| \"Manu Ginobili\" |\n| \"Tiago Splitter\" |\n...\n
You can add a >
or <
to the --
symbol to specify the direction of an edge.
In the following example, -->
represents an edge that starts from v
and points to v2
. For v, this is an outgoing edge; for v2, it is an incoming edge.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-->(v2:player) \\\n RETURN v2.player.name AS Name;\n+-----------------+\n| Name |\n+-----------------+\n| \"Manu Ginobili\" |\n| \"Tony Parker\" |\n+-----------------+\n
To query the properties of the target vertices, use the CASE
expression.
nebula> MATCH (v:player{name:\"Tim Duncan\"})--(v2) \\\n RETURN \\\n CASE WHEN v2.team.name IS NOT NULL \\\n THEN v2.team.name \\\n WHEN v2.player.name IS NOT NULL \\\n THEN v2.player.name END AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Manu Ginobili\" |\n| \"Manu Ginobili\" |\n| \"Spurs\" |\n| \"Dejounte Murray\" |\n...\n
To extend the pattern, you can add more vertices and edges.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-->(v2)<--(v3) \\\n RETURN v3.player.name AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Dejounte Murray\" |\n| \"LaMarcus Aldridge\" |\n| \"Marco Belinelli\" |\n...\n
If you do not need to refer to a vertex, you can omit the variable representing it in the parentheses.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-->()<--(v3) \\\n RETURN v3.player.name AS Name;\n+---------------------+\n| Name |\n+---------------------+\n| \"Dejounte Murray\" |\n| \"LaMarcus Aldridge\" |\n| \"Marco Belinelli\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_paths","title":"Match paths","text":"Connected vertices and edges form a path. You can use a user-defined variable to name a path as follows.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-->(v2) \\\n RETURN p;\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})> |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n
OpenCypher compatibility
In nGQL, the @
symbol represents the rank of an edge, but openCypher has no such concept.
nebula> MATCH ()<-[e]-() \\\n RETURN e \\\n LIMIT 3;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| [:follow \"player103\"->\"player102\" @0 {degree: 70}] |\n| [:follow \"player135\"->\"player102\" @0 {degree: 80}] |\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_edge_types","title":"Match edge types","text":"Just like vertices, you can specify edge types with :<edge_type>
in a pattern. For example: -[e:follow]-
.
OpenCypher compatibility
In previous versions, you had to create an index or use LIMIT
to limit the number of output results, and you had to specify the direction of the edge. Now you can execute the MATCH
statement to match edges without creating an index for the edge type or using LIMIT
to restrict the number of output results.
nebula> MATCH ()-[e:follow]->() \\\n        RETURN e;\n+----------------------------------------------------+\n| e                                                  |\n+----------------------------------------------------+\n| [:follow \"player102\"->\"player100\" @0 {degree: 75}] |\n| [:follow \"player102\"->\"player101\" @0 {degree: 75}] |\n| [:follow \"player129\"->\"player116\" @0 {degree: 90}] |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_edge_type_properties","title":"Match edge type properties","text":"Note
The prerequisite for matching an edge type property is that the edge type itself has an index of the corresponding property. Otherwise, you cannot execute the MATCH
statement to match the property.
You can specify edge type properties with {<prop_name>: <prop_value>}
in a pattern. For example: [e:follow{likeness:95}]
.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e:follow{degree:95}]->(v2) \\\n RETURN e;\n+--------------------------------------------------------+\n| e |\n+--------------------------------------------------------+\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n+--------------------------------------------------------+\n
Use the WHERE
clause to directly get all the edges with the edge property value 90.
nebula> MATCH ()-[e]->() \\\n WITH e, properties(e) as props, keys(properties(e)) as kk \\\n WHERE [i in kk where props[i] == 90] \\\n RETURN e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player125\"->\"player100\" @0 {degree: 90}] |\n| [:follow \"player140\"->\"player114\" @0 {degree: 90}] |\n| [:follow \"player133\"->\"player144\" @0 {degree: 90}] |\n| [:follow \"player133\"->\"player114\" @0 {degree: 90}] |\n...\n+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_multiple_edge_types","title":"Match multiple edge types","text":"The |
symbol can be used to match multiple edge types. For example: [e:follow|:serve]
. The colon (:) before the first edge type cannot be omitted, but the colon before subsequent edge types can be omitted, as in [e:follow|serve]
.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e:follow|:serve]->(v2) \\\n RETURN e;\n+---------------------------------------------------------------------------+\n| e |\n+---------------------------------------------------------------------------+\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n| [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] |\n+---------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_multiple_edges","title":"Match multiple edges","text":"You can extend a pattern to match multiple edges in a path.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[]->(v2)<-[e:serve]-(v3) \\\n RETURN v2, v3;\n+----------------------------------+-----------------------------------------------------------+\n| v2 | v3 |\n+----------------------------------+-----------------------------------------------------------+\n| (\"team204\" :team{name: \"Spurs\"}) | (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}) |\n| (\"team204\" :team{name: \"Spurs\"}) | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"team204\" :team{name: \"Spurs\"}) | (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_fixed-length_paths","title":"Match fixed-length paths","text":"You can use the :<edge_type>*<hop>
pattern to match a fixed-length path. hop
must be a non-negative integer.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n RETURN DISTINCT v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n+-----------------------------------------------------------+\n
If hop
is 0, the pattern will match the source vertex of the path.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) -[*0]-> (v2) \\\n RETURN v2;\n+----------------------------------------------------+\n| v2 |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n
Note
When you conditionally filter on multi-hop edges, such as -[e:follow*2]->
, note that the e
is a list of edges instead of a single edge.
For example, the following statement is syntactically correct but may not return the expected result, because e
is a list without the .degree
property.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n WHERE e.degree > 1 \\\n RETURN DISTINCT v2 AS Friends;\n
The correct statement is as follows:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n WHERE ALL(e_ in e WHERE e_.degree > 0) \\\n RETURN DISTINCT v2 AS Friends;\n
Further, to filter on the properties of only the first-hop edge in a multi-hop pattern, use a statement like the following:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n WHERE e[0].degree > 98 \\\n RETURN DISTINCT v2 AS Friends;\n
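Similarly, if only some of the hops need to satisfy the condition, a list predicate such as ANY() can be used instead of ALL(). A minimal sketch:
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*2]->(v2) \\\n        WHERE ANY(e_ IN e WHERE e_.degree > 90) \\\n        RETURN DISTINCT v2 AS Friends;\n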
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_variable-length_paths","title":"Match variable-length paths","text":"You can use the :<edge_type>*[minHop..maxHop]
pattern to match variable-length paths. minHop
and maxHop
are optional and default to 1 and infinity respectively.
Caution
If maxHop
is not set, the query may consume excessive memory and cause the Graph service to run out of memory (OOM). Execute such statements with caution.
minHop
Optional. minHop
indicates the minimum length of the path, which must be a non-negative integer. The default value is 1. maxHop
Optional. maxHop
indicates the maximum length of the path, which must be a non-negative integer. The default value is infinity. If neither minHop
nor maxHop
is specified, and only :<edge_type>*
is set, the default values are applied to both, i.e., minHop
is 1 and maxHop
is infinity.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*]->(v2) \\\n RETURN v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n...\n\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..3]->(v2) \\\n RETURN v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n...\n\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..]->(v2) \\\n RETURN v2 AS Friends;\n+-----------------------------------------------------------+\n| Friends |\n+-----------------------------------------------------------+\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n...\n
You can use the DISTINCT
keyword to aggregate duplicate results.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*1..3]->(v2:player) \\\n RETURN DISTINCT v2 AS Friends, count(v2);\n+-----------------------------------------------------------+-----------+\n| Friends | count(v2) |\n+-----------------------------------------------------------+-----------+\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) | 1 |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | 4 |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | 3 |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) | 3 |\n+-----------------------------------------------------------+-----------+\n
If minHop
is 0
, the pattern will match the source vertex of the path. Compared to the preceding statement, the following example uses 0
as the minHop
. So in the following result set, \"Tim Duncan\"
is counted one more time than it is in the preceding result set because it is the source vertex.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow*0..3]->(v2:player) \\\n RETURN DISTINCT v2 AS Friends, count(v2);\n+-----------------------------------------------------------+-----------+\n| Friends | count(v2) |\n+-----------------------------------------------------------+-----------+\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) | 1 |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | 5 |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) | 3 |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | 3 |\n+-----------------------------------------------------------+-----------+\n
Note
When using the variable e
to match fixed-length or variable-length paths in a pattern, such as -[e:follow*0..3]->
, you cannot reference e
in other patterns. For example, the following statement is not supported.
nebula> MATCH (v:player)-[e:like*1..3]->(n) \\\n WHERE (n)-[e*1..4]->(:player) \\\n RETURN v;\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_variable-length_paths_with_multiple_edge_types","title":"Match variable-length paths with multiple edge types","text":"You can specify multiple edge types in a fixed-length or variable-length pattern. In this case, hop
, minHop
, and maxHop
take effect on all edge types.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e:follow|serve*2]->(v2) \\\n RETURN DISTINCT v2;\n+-----------------------------------------------------------+\n| v2 |\n+-----------------------------------------------------------+\n| (\"team204\" :team{name: \"Spurs\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n| (\"team215\" :team{name: \"Hornets\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n+-----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_multiple_patterns","title":"Match multiple patterns","text":"You can separate multiple patterns with commas (,).
nebula> CREATE TAG INDEX IF NOT EXISTS team_index ON team(name(20));\nnebula> REBUILD TAG INDEX team_index;\nnebula> MATCH (v1:player{name:\"Tim Duncan\"}), (v2:team{name:\"Spurs\"}) \\\n RETURN v1,v2;\n+----------------------------------------------------+----------------------------------+\n| v1 | v2 |\n+----------------------------------------------------+----------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | (\"team204\" :team{name: \"Spurs\"}) |\n+----------------------------------------------------+----------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#match_shortest_paths","title":"Match shortest paths","text":"The allShortestPaths
function can be used to find all shortest paths between two vertices.
nebula> MATCH p = allShortestPaths((a:player{name:\"Tim Duncan\"})-[e*..5]-(b:player{name:\"Tony Parker\"})) \\\n RETURN p;\n+------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})<-[:follow@0 {degree: 95}]-(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n+------------------------------------------------------------------------------------------------------------------------------------+\n
The shortestPath
function can be used to find a single shortest path between two vertices.
nebula> MATCH p = shortestPath((a:player{name:\"Tim Duncan\"})-[e*..5]-(b:player{name:\"Tony Parker\"})) \\\n RETURN p;\n+------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})<-[:follow@0 {degree: 95}]-(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n+------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#retrieve_with_multiple_match","title":"Retrieve with multiple match","text":"Multiple MATCH
clauses can be used when different patterns require different filtering criteria. Only the rows that match all the patterns are returned.
nebula> MATCH (m)-[]->(n) WHERE id(m)==\"player100\" \\\n MATCH (n)-[]->(l) WHERE id(n)==\"player125\" \\\n RETURN id(m),id(n),id(l);\n+-------------+-------------+-------------+\n| id(m) | id(n) | id(l) |\n+-------------+-------------+-------------+\n| \"player100\" | \"player125\" | \"team204\" |\n| \"player100\" | \"player125\" | \"player100\" |\n+-------------+-------------+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/2.match/#retrieve_with_optional_match","title":"Retrieve with optional match","text":"See OPTIONAL MATCH.
Caution
In NebulaGraph, the performance and resource usage of the MATCH
statement have been optimized. However, we still recommend using GO
, LOOKUP
, |
, and FETCH
instead of MATCH
when high performance is required.
The GO
statement traverses the graph in NebulaGraph starting from one or more given vertices, applies the specified filters, and returns the results.
This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#syntax","title":"Syntax","text":"GO [[<M> TO] <N> {STEP|STEPS}] FROM <vertex_list>\nOVER <edge_type_list> [{REVERSELY | BIDIRECT}]\n[ WHERE <conditions>\u00a0]\nYIELD\u00a0[DISTINCT] <return_list>\n[{SAMPLE <sample_list> | <limit_by_list_clause>}]\n[| GROUP BY {col_name | expr | position} YIELD <col_name>]\n[| ORDER BY <expression> [{ASC | DESC}]]\n[| LIMIT [<offset>,] <number_rows>];\n\n<vertex_list> ::=\n <vid> [, <vid> ...]\n\n<edge_type_list> ::=\n edge_type [, edge_type ...]\n | *\n\n<return_list> ::=\n <col_name> [AS <col_alias>] [, <col_name> [AS <col_alias>] ...]\n
<N> {STEP|STEPS}
: specifies the hop number. If not specified, the default value for N
is one
. When N
is zero
, NebulaGraph does not traverse any edges and returns nothing.
Note
The path type of the GO
statement is walk
, which means both vertices and edges can be repeatedly visited in graph traversal. For more information, see Path.
M TO N {STEP|STEPS}
: traverses from M to N
hops. When M
is zero
, the output is the same as when M
is one
. That is, the output of GO 0 TO 2
and GO 1 TO 2
are the same.<vertex_list>
: represents a list of vertex IDs separated by commas.<edge_type_list>
: represents a list of edge types which the traversal can go through.REVERSELY | BIDIRECT
: defines the direction of the query. By default, the GO
statement searches for outgoing edges of <vertex_list>
. If REVERSELY
is set, GO
searches for incoming edges. If BIDIRECT
is set, GO
searches for edges of both directions. The direction of the query can be checked by returning the <edge_type>._type
field using YIELD
. A positive value indicates an outgoing edge, while a negative value indicates an incoming edge (a sketch follows the first example below).
WHERE <expression>
: specifies the traversal filters. You can use the WHERE
clause for the source vertices, the edges, and the destination vertices. You can use it together with AND
, OR
, NOT
, and XOR
. For more information, see WHERE.
Note
The WHERE
clause has restrictions when you traverse along multiple edge types. For example, WHERE edge1.prop1 > edge2.prop2
is not supported.YIELD [DISTINCT] <return_list>
: defines the output to be returned. It is recommended to use the Schema-related functions to fill in <return_list>
. src(edge)
, dst(edge)
, type(edge)
, rank(edge)
, etc., are currently supported, while nested functions are not. For more information, see YIELD.SAMPLE <sample_list>
: takes samples from the result set. For more information, see SAMPLE.<limit_by_list_clause>
: limits the number of outputs during the traversal process. For more information, see LIMIT.GROUP BY
: groups the output into subgroups based on the value of the specified property. For more information, see GROUP BY. After grouping, you need to use YIELD
again to define the output that needs to be returned.ORDER BY
: sorts outputs with specified orders. For more information, see ORDER BY.
Note
When the sorting method is not specified, the output orders can be different for the same query.
LIMIT [<offset>,] <number_rows>
: limits the number of rows of the output. For more information, see LIMIT.WHERE
and YIELD
clauses in GO
statements usually utilize property reference symbols ($^
and $$
) or the properties($^)
and properties($$)
functions to specify the properties of a vertex; use the properties(edge)
function to specify the properties of an edge. For details, see Property Reference Symbols and Schema-related Functions. When referencing the results of a subquery in a compound GO
statement, you need to set a name for the result and pass it to the next subquery using the pipe symbol |
, and reference the name of the result in the next subquery using $-
. See the Pipe Operator for details. When a queried property has no value, the returned result displays NULL.
For example, to query the team that a person belongs to, assuming that the person is connected to the team by the serve
edge and the person's ID is player102
.
nebula>\u00a0GO FROM \"player102\" OVER serve YIELD dst(edge);\n+-----------+\n| dst(EDGE) |\n+-----------+\n| \"team203\" |\n| \"team204\" |\n+-----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_all_vertices_within_a_specified_number_of_hops_from_a_starting_vertex","title":"To query all vertices within a specified number of hops from a starting vertex","text":"For example, to query all vertices within two hops of a person vertex, assuming that the person is connected to other people by the follow
edge and the person's ID is player102
.
# Return all vertices that are 2 hops away from the player102 vertex.\nnebula> GO 2 STEPS FROM \"player102\" OVER follow YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n| \"player100\" |\n| \"player102\" |\n| \"player125\" |\n+-------------+\n
# Return all vertices within 1 or 2 hops away from the player100 vertex.\nnebula> GO 1 TO 2 STEPS FROM \"player100\" OVER follow \\\n        YIELD dst(edge) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n...\n\n# The following MATCH query has the same semantics as the previous GO query.\nnebula> MATCH (v) -[e:follow*1..2]->(v2) \\\n        WHERE id(v) == \"player100\" \\\n        RETURN id(v2) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player100\" |\n| \"player102\" |\n...\n
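As noted in the syntax description, when M is zero the output is the same as when M is one, so the following sketch returns the same result as the GO 1 TO 2 query above:
nebula> GO 0 TO 2 STEPS FROM \"player100\" OVER follow \\\n        YIELD dst(edge) AS destination;\n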
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_add_filtering_conditions","title":"To add filtering conditions","text":"Case: To query the vertices and edges that meet specific conditions.
For example, use the WHERE
clause to query the edges with specific properties between the starting vertex and the destination vertex.
nebula>\u00a0GO FROM \"player100\", \"player102\" OVER serve \\\n WHERE properties(edge).start_year > 1995 \\\n YIELD DISTINCT properties($$).name AS team_name, properties(edge).start_year AS start_year, properties($^).name AS player_name;\n\n+-----------------+------------+---------------------+\n| team_name | start_year | player_name |\n+-----------------+------------+---------------------+\n| \"Spurs\" | 1997 | \"Tim Duncan\" |\n| \"Trail Blazers\" | 2006 | \"LaMarcus Aldridge\" |\n| \"Spurs\" | 2015 | \"LaMarcus Aldridge\" |\n+-----------------+------------+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_all_edges","title":"To query all edges","text":"Case: To query all edges that are connected to the starting vertex.
# Return all edges that are connected to the player102 vertex.\nnebula> GO FROM \"player102\" OVER * BIDIRECT YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| [:follow \"player103\"->\"player102\" @0 {degree: 70}] |\n| [:follow \"player135\"->\"player102\" @0 {degree: 80}] |\n| [:follow \"player102\"->\"player100\" @0 {degree: 75}] |\n| [:follow \"player102\"->\"player101\" @0 {degree: 75}] |\n| [:serve \"player102\"->\"team203\" @0 {end_year: 2015, start_year: 2006}] |\n| [:serve \"player102\"->\"team204\" @0 {end_year: 2019, start_year: 2015}] |\n+-----------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_multiple_edge_types","title":"To query multiple edge types","text":"Case: To query multiple edge types that are connected to the starting vertex. You can specify multiple edge types or the *
symbol to query multiple edge types.
For example, to query the follow
and serve
edges that are connected to the starting vertex.
nebula> GO FROM \"player100\" OVER follow, serve \\\n YIELD properties(edge).degree, properties(edge).start_year;\n+-------------------------+-----------------------------+\n| properties(EDGE).degree | properties(EDGE).start_year |\n+-------------------------+-----------------------------+\n| 95 | __NULL__ |\n| 95 | __NULL__ |\n| __NULL__ | 1997 |\n+-------------------------+-----------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_query_incoming_vertices_using_the_reversely_keyword","title":"To query incoming vertices using the REVERSELY keyword","text":"# Return the vertices that follow the player100 vertex.\nnebula> GO FROM \"player100\" OVER follow REVERSELY \\\n YIELD src(edge) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player101\" |\n| \"player102\" |\n...\n\n# The following MATCH query has the same semantics as the previous GO query.\nnebula> MATCH (v)<-[e:follow]- (v2) WHERE id(v) == 'player100' \\\n RETURN id(v2) AS destination;\n+-------------+\n| destination |\n+-------------+\n| \"player101\" |\n| \"player102\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_use_subqueries_as_the_starting_vertice_of_a_graph_traversal","title":"To use subqueries as the starting vertice of a graph traversal","text":"# Return the friends of the player100 vertex and the teams that the friends belong to.\nnebula> GO FROM \"player100\" OVER follow REVERSELY \\\n YIELD src(edge) AS id | \\\n GO FROM $-.id OVER serve \\\n WHERE properties($^).age > 20 \\\n YIELD properties($^).name AS FriendOf, properties($$).name AS Team;\n+---------------------+-----------------+\n| FriendOf | Team |\n+---------------------+-----------------+\n| \"Boris Diaw\" | \"Spurs\" |\n| \"Boris Diaw\" | \"Jazz\" |\n| \"Boris Diaw\" | \"Suns\" |\n...\n\n# The following MATCH query has the same semantics as the previous GO query.\nnebula> MATCH (v)<-[e:follow]- (v2)-[e2:serve]->(v3) \\\n WHERE id(v) == 'player100' \\\n RETURN v2.player.name AS FriendOf, v3.team.name AS Team;\n+---------------------+-----------------+\n| FriendOf | Team |\n+---------------------+-----------------+\n| \"Boris Diaw\" | \"Spurs\" |\n| \"Boris Diaw\" | \"Jazz\" |\n| \"Boris Diaw\" | \"Suns\" |\n...\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_use_group_by_to_group_the_output","title":"To use GROUP BY
to group the output","text":"You need to use YIELD
to define the output that needs to be returned after grouping.
# The following example collects the outputs according to age.\nnebula> GO 2 STEPS FROM \"player100\" OVER follow \\\n YIELD src(edge) AS src, dst(edge) AS dst, properties($$).age AS age \\\n | GROUP BY $-.dst \\\n YIELD $-.dst AS dst, collect_set($-.src) AS src, collect($-.age) AS age;\n+-------------+----------------------------+----------+\n| dst | src | age |\n+-------------+----------------------------+----------+\n| \"player125\" | {\"player101\"} | [41] |\n| \"player100\" | {\"player125\", \"player101\"} | [42, 42] |\n| \"player102\" | {\"player101\"} | [33] |\n+-------------+----------------------------+----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#to_use_order_by_and_limit_to_sort_and_limit_the_output","title":"To use ORDER BY
and LIMIT
to sort and limit the output","text":"# The following example groups the outputs and restricts the number of rows of the outputs.\nnebula> $a = GO FROM \"player100\" OVER follow YIELD src(edge) AS src, dst(edge) AS dst; \\\n GO 2 STEPS FROM $a.dst OVER follow \\\n YIELD $a.src AS src, $a.dst, src(edge), dst(edge) \\\n | ORDER BY $-.src | OFFSET 1 LIMIT 2;\n+-------------+-------------+-------------+-------------+\n| src | $a.dst | src(EDGE) | dst(EDGE) |\n+-------------+-------------+-------------+-------------+\n| \"player100\" | \"player101\" | \"player100\" | \"player101\" |\n| \"player100\" | \"player125\" | \"player100\" | \"player125\" |\n+-------------+-------------+-------------+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/3.go/#other_examples","title":"Other examples","text":"# The following example determines if $$.player.name IS NOT EMPTY.\nnebula> GO FROM \"player100\" OVER follow WHERE properties($$).name IS NOT EMPTY YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player125\" |\n| \"player101\" |\n+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/","title":"FETCH","text":"The FETCH
statement retrieves the properties of the specified vertices or edges.
This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties","title":"Fetch vertex properties","text":""},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#syntax","title":"Syntax","text":"FETCH PROP ON {<tag_name>[, tag_name ...] | *}\n<vid> [, vid ...]\nYIELD [DISTINCT] <return_list> [AS <alias>];\n
Parameter Description
tag_name: The name of the tag.
*: Represents all the tags in the current graph space.
vid: The vertex ID.
YIELD: Define the output to be returned. For details, see YIELD.
AS: Set an alias."},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties_by_one_tag","title":"Fetch vertex properties by one tag","text":"Specify a tag in the FETCH
statement to fetch the vertex properties by that tag.
nebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n+-------------------------------+\n| properties(VERTEX) |\n+-------------------------------+\n| {age: 42, name: \"Tim Duncan\"} |\n+-------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_specific_properties_of_a_vertex","title":"Fetch specific properties of a vertex","text":"Use a YIELD
clause to specify the properties to be returned.
nebula> FETCH PROP ON player \"player100\" \\\n YIELD properties(vertex).name AS name;\n+--------------+\n| name |\n+--------------+\n| \"Tim Duncan\" |\n+--------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_properties_of_multiple_vertices","title":"Fetch properties of multiple vertices","text":"Specify multiple VIDs (vertex IDs) to fetch properties of multiple vertices. Separate the VIDs with commas.
nebula> FETCH PROP ON player \"player101\", \"player102\", \"player103\" YIELD properties(vertex);\n+--------------------------------------+\n| properties(VERTEX) |\n+--------------------------------------+\n| {age: 33, name: \"LaMarcus Aldridge\"} |\n| {age: 36, name: \"Tony Parker\"} |\n| {age: 32, name: \"Rudy Gay\"} |\n+--------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties_by_multiple_tags","title":"Fetch vertex properties by multiple tags","text":"Specify multiple tags in the FETCH
statement to fetch the vertex properties by the tags. Separate the tags with commas.
# The following example creates a new tag t1.\nnebula> CREATE TAG IF NOT EXISTS t1(a string, b int);\n\n# The following example attaches t1 to the vertex \"player100\".\nnebula> INSERT VERTEX t1(a, b) VALUES \"player100\":(\"Hello\", 100);\n\n# The following example fetches the properties of vertex \"player100\" by the tags player and t1.\nnebula> FETCH PROP ON player, t1 \"player100\" YIELD vertex AS v;\n+----------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :t1{a: \"Hello\", b: 100}) |\n+----------------------------------------------------------------------------+\n
You can combine multiple tags with multiple VIDs in a FETCH
statement.
nebula> FETCH PROP ON player, t1 \"player100\", \"player103\" YIELD vertex AS v;\n+----------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :t1{a: \"Hello\", b: 100}) |\n| (\"player103\" :player{age: 32, name: \"Rudy Gay\"}) |\n+----------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_vertex_properties_by_all_tags","title":"Fetch vertex properties by all tags","text":"Set an asterisk symbol *
to fetch properties by all tags in the current graph space.
nebula> FETCH PROP ON * \"player100\", \"player106\", \"team200\" YIELD vertex AS v;\n+----------------------------------------------------------------------------+\n| v |\n+----------------------------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"} :t1{a: \"Hello\", b: 100}) |\n| (\"player106\" :player{age: 25, name: \"Kyle Anderson\"}) |\n| (\"team200\" :team{name: \"Warriors\"}) |\n+----------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_edge_properties","title":"Fetch edge properties","text":""},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#syntax_1","title":"Syntax","text":"FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]\nYIELD <output>;\n
Parameter Description
edge_type: The name of the edge type.
src_vid: The VID of the source vertex. It specifies the start of an edge.
dst_vid: The VID of the destination vertex. It specifies the end of an edge.
rank: The rank of the edge. It is optional and defaults to 0. It distinguishes an edge from other edges with the same edge type, source vertex, and destination vertex.
YIELD: Define the output to be returned. For details, see YIELD."},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_all_properties_of_an_edge","title":"Fetch all properties of an edge","text":"The following statement fetches all the properties of the serve
edge that connects vertex \"player100\"
and vertex \"team204\"
.
nebula> FETCH PROP ON serve \"player100\" -> \"team204\" YIELD properties(edge);\n+------------------------------------+\n| properties(EDGE) |\n+------------------------------------+\n| {end_year: 2016, start_year: 1997} |\n+------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_specific_properties_of_an_edge","title":"Fetch specific properties of an edge","text":"Use a YIELD
clause to fetch specific properties of an edge.
nebula> FETCH PROP ON serve \"player100\" -> \"team204\" \\\n YIELD properties(edge).start_year;\n+-----------------------------+\n| properties(EDGE).start_year |\n+-----------------------------+\n| 1997 |\n+-----------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_properties_of_multiple_edges","title":"Fetch properties of multiple edges","text":"Specify multiple edge patterns (<src_vid> -> <dst_vid>[@<rank>]
) to fetch properties of multiple edges. Separate the edge patterns with commas.
nebula> FETCH PROP ON serve \"player100\" -> \"team204\", \"player133\" -> \"team202\" YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e |\n+-----------------------------------------------------------------------+\n| [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] |\n| [:serve \"player133\"->\"team202\" @0 {end_year: 2011, start_year: 2002}] |\n+-----------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#fetch_properties_based_on_edge_rank","title":"Fetch properties based on edge rank","text":"If there are multiple edges with the same edge type, source vertex, and destination vertex, you can specify the rank to fetch the properties on the correct edge.
# The following example inserts edges with different ranks and property values.\nnebula> INSERT EDGE serve(start_year,end_year) \\\n        VALUES \"player100\"->\"team204\"@1:(1998, 2017);\n\nnebula> INSERT EDGE serve(start_year,end_year) \\\n        VALUES \"player100\"->\"team204\"@2:(1990, 2018);\n\n# By default, the FETCH statement returns the edge whose rank is 0.\nnebula> FETCH PROP ON serve \"player100\" -> \"team204\" YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e                                                                     |\n+-----------------------------------------------------------------------+\n| [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] |\n+-----------------------------------------------------------------------+\n\n# To fetch on an edge whose rank is not 0, set its rank in the FETCH statement.\nnebula> FETCH PROP ON serve \"player100\" -> \"team204\"@1 YIELD edge AS e;\n+-----------------------------------------------------------------------+\n| e                                                                     |\n+-----------------------------------------------------------------------+\n| [:serve \"player100\"->\"team204\" @1 {end_year: 2017, start_year: 1998}] |\n+-----------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/4.fetch/#use_fetch_in_composite_queries","title":"Use FETCH in composite queries","text":"A common way to use FETCH
is to combine it with native nGQL such as GO
.
The following statement returns the degree
values of the follow
edges that start from vertex \"player101\"
.
nebula> GO FROM \"player101\" OVER follow \\\n YIELD src(edge) AS s, dst(edge) AS d \\\n | FETCH PROP ON follow $-.s -> $-.d \\\n YIELD properties(edge).degree;\n+-------------------------+\n| properties(EDGE).degree |\n+-------------------------+\n| 95 |\n| 90 |\n| 95 |\n+-------------------------+\n
Or you can use user-defined variables to construct similar queries.
nebula> $var = GO FROM \"player101\" OVER follow \\\n YIELD src(edge) AS s, dst(edge) AS d; \\\n FETCH PROP ON follow $var.s -> $var.d \\\n YIELD properties(edge).degree;\n+-------------------------+\n| properties(EDGE).degree |\n+-------------------------+\n| 95 |\n| 90 |\n| 95 |\n+-------------------------+\n
For more information about composite queries, see Composite queries (clause structure).
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/","title":"LOOKUP","text":"The LOOKUP
statement traverses data based on indexes. You can use LOOKUP
for the following purposes:
Search for specific data based on conditions defined by the WHERE
clause. List vertices or edges with a tag or an edge type.
This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/#precautions","title":"Precautions","text":"If the specified property is not indexed when using the LOOKUP
statement, NebulaGraph randomly selects one of the available indexes.
For example, the tag player
has two properties, name
and age
. Both the tag player
itself and the property name
have indexes, but the property age
has no indexes. When running LOOKUP ON player WHERE player.age == 36 YIELD player.name;
, NebulaGraph randomly uses one of the indexes of the tag player
and the property name
. You can use the EXPLAIN
statement to check the selected index.
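A minimal sketch of such a check; the chosen index appears in the returned execution plan:
nebula> EXPLAIN LOOKUP ON player WHERE player.age == 36 YIELD player.name;\n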
Legacy version compatibility
Before the release 2.5.0, if the specified property is not indexed when using the LOOKUP
statement, NebulaGraph reports an error and does not use other indexes.
Before using the LOOKUP
statement, make sure that at least one index has been created. If related vertices, edges, or properties already existed before the index was created, you must rebuild the index after creating it to make it valid.
LOOKUP ON {<vertex_tag> | <edge_type>}\n[WHERE <expression> [AND <expression> ...]]\nYIELD [DISTINCT] <return_list> [AS <alias>];\n\n<return_list>\n <prop_name> [AS <col_alias>] [, <prop_name> [AS <prop_alias>] ...];\n
WHERE <expression>
: filters data with specified conditions. Both AND
and OR
are supported between different expressions. For more information, see WHERE.YIELD
: Define the output to be returned. For details, see YIELD
.DISTINCT
: Aggregate the output results and return the de-duplicated result set.AS
: Set an alias.WHERE
in LOOKUP
","text":"The WHERE
clause in a LOOKUP
statement does not support the following operations:
$-
and $^
.rank()
. Relational expressions with field names on both sides of the operator, such as tagName.prop1 > tagName.prop2
. The XOR
operation is not supported. String operations other than STARTS WITH
are not supported.
The following example returns vertices whose name
is Tony Parker
and the tag is player
.
nebula> CREATE TAG INDEX IF NOT EXISTS index_player ON player(name(30), age);\n\nnebula> REBUILD TAG INDEX index_player;\n+------------+\n| New Job Id |\n+------------+\n| 15 |\n+------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name == \"Tony Parker\" \\\n YIELD id(vertex);\n+---------------+\n| id(VERTEX) |\n+---------------+\n| \"player101\" |\n+---------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name == \"Tony Parker\" \\\n YIELD properties(vertex).name AS name, properties(vertex).age AS age;\n+---------------+-----+\n| name | age |\n+---------------+-----+\n| \"Tony Parker\" | 36 |\n+---------------+-----+\n\nnebula> LOOKUP ON player \\\n WHERE player.age > 45 \\\n YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player144\" |\n| \"player140\" |\n+-------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name STARTS WITH \"B\" \\\n AND player.age IN [22,30] \\\n YIELD properties(vertex).name, properties(vertex).age;\n+-------------------------+------------------------+\n| properties(VERTEX).name | properties(VERTEX).age |\n+-------------------------+------------------------+\n| \"Ben Simmons\" | 22 |\n| \"Blake Griffin\" | 30 |\n+-------------------------+------------------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.name == \"Kobe Bryant\"\\\n YIELD id(vertex) AS VertexID, properties(vertex).name AS name |\\\n GO FROM $-.VertexID OVER serve \\\n YIELD $-.name, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+---------------+-----------------------------+---------------------------+---------------------+\n| $-.name | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+---------------+-----------------------------+---------------------------+---------------------+\n| \"Kobe Bryant\" | 1996 | 2016 | \"Lakers\" |\n+---------------+-----------------------------+---------------------------+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/#retrieve_edges","title":"Retrieve edges","text":"The following example returns edges whose degree
is 90
and the edge type is follow
.
nebula> CREATE EDGE INDEX IF NOT EXISTS index_follow ON follow(degree);\n\nnebula> REBUILD EDGE INDEX index_follow;\n+------------+\n| New Job Id |\n+------------+\n| 62 |\n+------------+\n\nnebula> LOOKUP ON follow \\\n WHERE follow.degree == 90 YIELD edge AS e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player109\"->\"player125\" @0 {degree: 90}] |\n| [:follow \"player118\"->\"player120\" @0 {degree: 90}] |\n| [:follow \"player118\"->\"player131\" @0 {degree: 90}] |\n...\n\nnebula> LOOKUP ON follow \\\n WHERE follow.degree == 90 \\\n YIELD properties(edge).degree;\n+-------------+-------------+---------+-------------------------+\n| SrcVID | DstVID | Ranking | properties(EDGE).degree |\n+-------------+-------------+---------+-------------------------+\n| \"player150\" | \"player143\" | 0 | 90 |\n| \"player150\" | \"player137\" | 0 | 90 |\n| \"player148\" | \"player136\" | 0 | 90 |\n...\n\nnebula> LOOKUP ON follow \\\n WHERE follow.degree == 60 \\\n YIELD dst(edge) AS DstVID, properties(edge).degree AS Degree |\\\n GO FROM $-.DstVID OVER serve \\\n YIELD $-.DstVID, properties(edge).start_year, properties(edge).end_year, properties($$).name;\n+-------------+-----------------------------+---------------------------+---------------------+\n| $-.DstVID | properties(EDGE).start_year | properties(EDGE).end_year | properties($$).name |\n+-------------+-----------------------------+---------------------------+---------------------+\n| \"player105\" | 2010 | 2018 | \"Spurs\" |\n| \"player105\" | 2009 | 2010 | \"Cavaliers\" |\n| \"player105\" | 2018 | 2019 | \"Raptors\" |\n+-------------+-----------------------------+---------------------------+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/5.lookup/#list_vertices_or_edges_with_a_tag_or_an_edge_type","title":"List vertices or edges with a tag or an edge type","text":"To list vertices or edges with a tag or an edge type, at least one index must exist on the tag, the edge type, or its property.
For example, if there is a player
tag with a name
property and an age
property, to retrieve the VID of all vertices tagged with player
, there has to be an index on the player
tag itself, the name
property, or the age
property.
player
.nebula> CREATE TAG IF NOT EXISTS player(name string,age int);\n\nnebula> CREATE TAG INDEX IF NOT EXISTS player_index on player();\n\nnebula> REBUILD TAG INDEX player_index;\n+------------+\n| New Job Id |\n+------------+\n| 66 |\n+------------+\n\nnebula> INSERT VERTEX player(name,age) \\\n VALUES \"player100\":(\"Tim Duncan\", 42), \"player101\":(\"Tony Parker\", 36);\n\nThe following statement retrieves the VID of all vertices with the tag `player`. It is similar to `MATCH (n:player) RETURN id(n) /*, n */`.\n\nnebula> LOOKUP ON player YIELD id(vertex);\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n...\n
follow
edge type.nebula> CREATE EDGE IF NOT EXISTS follow(degree int);\n\nnebula> CREATE EDGE INDEX IF NOT EXISTS follow_index on follow();\n\nnebula> REBUILD EDGE INDEX follow_index;\n+------------+\n| New Job Id |\n+------------+\n| 88 |\n+------------+\n\nnebula> INSERT EDGE follow(degree) \\\n VALUES \"player100\"->\"player101\":(95);\n\nThe following statement retrieves all edges with the edge type `follow`. It is similar to `MATCH (s)-[e:follow]->(d) RETURN id(s), rank(e), id(d) /*, type(e) */`.\n\nnebula)> LOOKUP ON follow YIELD edge AS e;\n+-----------------------------------------------------+\n| e |\n+-----------------------------------------------------+\n| [:follow \"player105\"->\"player100\" @0 {degree: 70}] |\n| [:follow \"player105\"->\"player116\" @0 {degree: 80}] |\n| [:follow \"player109\"->\"player100\" @0 {degree: 80}] |\n...\n
The following example shows how to count the number of vertices tagged with player
and edges of the follow
edge type.
nebula> LOOKUP ON player YIELD id(vertex)|\\\n YIELD COUNT(*) AS Player_Number;\n+---------------+\n| Player_Number |\n+---------------+\n| 51 |\n+---------------+\n\nnebula> LOOKUP ON follow YIELD edge AS e| \\\n YIELD COUNT(*) AS Follow_Number;\n+---------------+\n| Follow_Number |\n+---------------+\n| 81 |\n+---------------+\n
Note
You can also use SHOW STATS
to count the numbers of vertices or edges.
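A sketch; SHOW STATS requires the statistics job to have been run first:
nebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\n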
The FIND PATH
statement finds the paths between the selected source vertices and destination vertices.
Note
To improve the query performance with the FIND PATH
statement, you can add the num_operator_threads
parameter in the nebula-graphd.conf
configuration file. The value range of the num_operator_threads
parameter is [2, 10]. Make sure that the value is not greater than the number of CPU cores of the machine where the graphd
service is deployed. It is recommended to set the value to the number of CPU cores of the machine where the graphd
service is deployed. For more information about the nebula-graphd.conf
configuration file, see nebula-graphd.conf.
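For example, on a machine with 4 CPU cores, the setting in nebula-graphd.conf could look like the following sketch (adjust the value to your core count):
--num_operator_threads=4\n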
FIND { SHORTEST | SINGLE SHORTEST | ALL | NOLOOP } PATH [WITH PROP] FROM <vertex_id_list> TO <vertex_id_list>\nOVER <edge_type_list> [REVERSELY | BIDIRECT] \n[<WHERE clause>] [UPTO <N> {STEP|STEPS}] \nYIELD path as <alias>\n[| ORDER BY $-.path] [| LIMIT <M>];\n\n<vertex_id_list> ::=\n [vertex_id [, vertex_id] ...]\n
SHORTEST
finds all the shortest paths. ALL
finds all the paths. NOLOOP
finds the paths without loops. WITH PROP
shows properties of vertices and edges. If not specified, properties will be hidden.<vertex_id_list>
is a list of vertex IDs separated with commas (,). It supports $-
and $var
.<edge_type_list>
is a list of edge types separated with commas (,). *
is all edge types.REVERSELY | BIDIRECT
specifies the direction. REVERSELY
is reverse graph traversal while BIDIRECT
is bidirectional graph traversal.<WHERE clause>
filters properties of edges.UPTO <N> {STEP|STEPS}
is the maximum hop number of the path. The default value is 5
.ORDER BY $-.path
specifies the order of the returned paths. For information about the order rules, see Path.LIMIT <M>
specifies the maximum number of rows to return.Note
The path type of FIND PATH
is trail
. Only vertices can be repeatedly visited in graph traversal. For more information, see Path.
FIND PATH
only supports filtering properties of edges with WHERE
clauses. Filtering on the properties of vertices and using functions in filters are not supported for now. FIND PATH
is a single-threaded procedure, so it can consume a lot of memory.
A returned path has the form (<vertex_id>)-[:<edge_type_name>@<rank>]->(<vertex_id>)
.
nebula> FIND SHORTEST PATH FROM \"player102\" TO \"team204\" OVER * YIELD path AS p;\n+--------------------------------------------+\n| p |\n+--------------------------------------------+\n| <(\"player102\")-[:serve@0 {}]->(\"team204\")> |\n+--------------------------------------------+\n
nebula> FIND SHORTEST PATH WITH PROP FROM \"team204\" TO \"player100\" OVER * REVERSELY YIELD path AS p;\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"team204\" :team{name: \"Spurs\"})<-[:serve@0 {end_year: 2016, start_year: 1997}]-(\"player100\" :player{age: 42, name: \"Tim Duncan\"})> |\n+--------------------------------------------------------------------------------------------------------------------------------------+\n
nebula> FIND SHORTEST PATH FROM \"player100\", \"player130\" TO \"player132\", \"player133\" OVER * BIDIRECT UPTO 18 STEPS YIELD path as p;\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\")<-[:follow@0 {}]-(\"player144\")<-[:follow@0 {}]-(\"player133\")> |\n| <(\"player100\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player138\")-[:serve@0 {}]->(\"team225\")<-[:serve@0 {}]-(\"player132\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player112\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player114\")<-[:follow@0 {}]-(\"player133\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player109\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player114\")<-[:follow@0 {}]-(\"player133\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player104\")-[:serve@20182019 {}]->(\"team204\")<-[:serve@0 {}]-(\"player114\")<-[:follow@0 {}]-(\"player133\")> |\n| ... |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player112\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player138\")-[:serve@0 {}]->(\"team225\")<-[:serve@0 {}]-(\"player132\")> |\n| <(\"player130\")-[:serve@0 {}]->(\"team219\")<-[:serve@0 {}]-(\"player109\")-[:serve@0 {}]->(\"team204\")<-[:serve@0 {}]-(\"player138\")-[:serve@0 {}]->(\"team225\")<-[:serve@0 {}]-(\"player132\")> |\n| ... |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
nebula> FIND ALL PATH FROM \"player100\" TO \"team204\" OVER * WHERE follow.degree is EMPTY or follow.degree >=0 YIELD path AS p;\n+------------------------------------------------------------------------------+\n| p |\n+------------------------------------------------------------------------------+\n| <(\"player100\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player125\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:serve@0 {}]->(\"team204\")> |\n|... |\n+------------------------------------------------------------------------------+\n
nebula> FIND NOLOOP PATH FROM \"player100\" TO \"team204\" OVER * YIELD path AS p;\n+--------------------------------------------------------------------------------------------------------+\n| p |\n+--------------------------------------------------------------------------------------------------------+\n| <(\"player100\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player125\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player125\")-[:serve@0 {}]->(\"team204\")> |\n| <(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player102\")-[:serve@0 {}]->(\"team204\")> |\n+--------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.find-path/#faq","title":"FAQ","text":""},{"location":"3.ngql-guide/7.general-query-statements/6.find-path/#does_it_support_the_where_clause_to_achieve_conditional_filtering_during_graph_traversal","title":"Does it support the WHERE clause to achieve conditional filtering during graph traversal?","text":"FIND PATH
only supports filtering properties of edges with WHERE
clauses, such as WHERE follow.degree is EMPTY or follow.degree >=0
.
Filtering properties of vertices is not supported for now.
"},{"location":"3.ngql-guide/7.general-query-statements/7.get-subgraph/","title":"GET SUBGRAPH","text":"The GET SUBGRAPH
statement returns a subgraph that is generated by traversing a graph starting from a specified vertex. GET SUBGRAPH
statements allow you to specify the number of steps and the type or direction of edges during the traversal.
GET SUBGRAPH [WITH PROP] [<step_count> {STEP|STEPS}] FROM {<vid>, <vid>...}\n[{IN | OUT | BOTH} <edge_type>, <edge_type>...]\n[WHERE <expression> [AND <expression> ...]]\nYIELD {[VERTICES AS <vertex_alias>] [,EDGES AS <edge_alias>]};\n
WITH PROP shows the properties. If not specified, the properties will be hidden.
step_count specifies the number of hops from the source vertices and returns the subgraph from 0 to step_count hops. It must be a non-negative integer. Its default value is 1.
vid specifies the vertex IDs.
edge_type specifies the edge type. You can use IN, OUT, and BOTH to specify the traversal direction of the edge type. The default is BOTH.
<WHERE clause> specifies the filter conditions for the traversal, which can be used with the boolean operator AND.
YIELD defines the output that needs to be returned. You can return only vertices or edges. A column alias must be set.
Note
The path type of GET SUBGRAPH
is trail
. Only vertices can be repeatedly visited in graph traversal. For more information, see Path.
While using the WHERE
clause in a GET SUBGRAPH
statement, note the following restrictions:
Only the AND operator is supported.
Only the properties of destination vertices can be filtered, in the form $$.tagName.propName.
The properties of edges can be filtered, in the form edge_type.propName.
The following graph is used as the sample.
Insert the test data:
nebula> CREATE SPACE IF NOT EXISTS subgraph(partition_num=15, replica_factor=1, vid_type=fixed_string(30));\nnebula> USE subgraph;\nnebula> CREATE TAG IF NOT EXISTS player(name string, age int);\nnebula> CREATE TAG IF NOT EXISTS team(name string);\nnebula> CREATE EDGE IF NOT EXISTS follow(degree int);\nnebula> CREATE EDGE IF NOT EXISTS serve(start_year int, end_year int);\nnebula> INSERT VERTEX player(name, age) VALUES \"player100\":(\"Tim Duncan\", 42);\nnebula> INSERT VERTEX player(name, age) VALUES \"player101\":(\"Tony Parker\", 36);\nnebula> INSERT VERTEX player(name, age) VALUES \"player102\":(\"LaMarcus Aldridge\", 33);\nnebula> INSERT VERTEX team(name) VALUES \"team203\":(\"Trail Blazers\"), \"team204\":(\"Spurs\");\nnebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player100\":(95);\nnebula> INSERT EDGE follow(degree) VALUES \"player101\" -> \"player102\":(90);\nnebula> INSERT EDGE follow(degree) VALUES \"player102\" -> \"player100\":(75);\nnebula> INSERT EDGE serve(start_year, end_year) VALUES \"player101\" -> \"team204\":(1999, 2018),\"player102\" -> \"team203\":(2006, 2015);\n
This example goes one step from the vertex player101
over all edge types and gets the subgraph.nebula> GET SUBGRAPH 1 STEPS FROM \"player101\" YIELD VERTICES AS nodes, EDGES AS relationships;\n+-------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n| nodes | relationships |\n+-------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n| [(\"player101\" :player{})] | [[:serve \"player101\"->\"team204\" @0 {}], [:follow \"player101\"->\"player100\" @0 {}], [:follow \"player101\"->\"player102\" @0 {}]] |\n| [(\"team204\" :team{}), (\"player100\" :player{}), (\"player102\" :player{})] | [[:follow \"player102\"->\"player100\" @0 {}]] |\n+-------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n
The returned subgraph is as follows.
This example goes one step from the vertex player101
over incoming follow
edges and gets the subgraph.nebula> GET SUBGRAPH 1 STEPS FROM \"player101\" IN follow YIELD VERTICES AS nodes, EDGES AS relationships;\n+---------------------------+---------------+\n| nodes | relationships |\n+---------------------------+---------------+\n| [(\"player101\" :player{})] | [] |\n+---------------------------+---------------+\n
There is no incoming follow
edge to player101
, so only the vertex player101
is returned.
This example goes one step from the vertex player101
over outgoing serve
edges, gets the subgraph, and shows the property of the edge.nebula> GET SUBGRAPH WITH PROP 1 STEPS FROM \"player101\" OUT serve YIELD VERTICES AS nodes, EDGES AS relationships;\n+-------------------------------------------------------+-------------------------------------------------------------------------+\n| nodes | relationships |\n+-------------------------------------------------------+-------------------------------------------------------------------------+\n| [(\"player101\" :player{age: 36, name: \"Tony Parker\"})] | [[:serve \"player101\"->\"team204\" @0 {end_year: 2018, start_year: 1999}]] |\n| [(\"team204\" :team{name: \"Spurs\"})] | [] |\n+-------------------------------------------------------+-------------------------------------------------------------------------+\n
The returned subgraph is as follows.
This example goes two steps from the vertex player101
over follow
edges, filters by degree > 90 and age > 30, and shows the properties of edges.nebula> GET SUBGRAPH WITH PROP 2 STEPS FROM \"player101\" \\\n WHERE follow.degree > 90 AND $$.player.age > 30 \\\n YIELD VERTICES AS nodes, EDGES AS relationships;\n+-------------------------------------------------------+------------------------------------------------------+\n| nodes | relationships |\n+-------------------------------------------------------+------------------------------------------------------+\n| [(\"player101\" :player{age: 36, name: \"Tony Parker\"})] | [[:follow \"player101\"->\"player100\" @0 {degree: 95}]] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"})] | [] |\n+-------------------------------------------------------+------------------------------------------------------+\n
step_count
?","text":"To show the completeness of the subgraph, an additional hop is made on all vertices that meet the conditions. The following graph is used as the sample.
The returned edges of GET SUBGRAPH 1 STEPS FROM "A"; are A->B, B->A, and A->C. To show the completeness of the subgraph, an additional hop is made on all vertices that meet the conditions, namely B->C.
The returned edge of GET SUBGRAPH 1 STEPS FROM "A" IN follow; is B->A. To show the completeness of the subgraph, an additional hop is made on all vertices that meet the conditions, namely A->B.
If you only query paths or vertices that meet the conditions, we suggest you use MATCH or GO. The example is as follows.
nebula> MATCH p= (v:player) -- (v2) WHERE id(v)==\"A\" RETURN p;\nnebula> GO 1 STEPS FROM \"A\" OVER follow YIELD src(edge),dst(edge);\n
"},{"location":"3.ngql-guide/7.general-query-statements/7.get-subgraph/#why_is_the_number_of_hops_in_the_returned_result_lower_than_step_count","title":"Why is the number of hops in the returned result lower than step_count
?","text":"The query stops when there is not enough subgraph data and will not return the null value.
nebula> GET SUBGRAPH 100 STEPS FROM \"player101\" OUT follow YIELD VERTICES AS nodes, EDGES AS relationships;\n+----------------------------------------------------+--------------------------------------------------------------------------------------+\n| nodes | relationships |\n+----------------------------------------------------+--------------------------------------------------------------------------------------+\n| [(\"player101\" :player{})] | [[:follow \"player101\"->\"player100\" @0 {}], [:follow \"player101\"->\"player102\" @0 {}]] |\n| [(\"player100\" :player{}), (\"player102\" :player{})] | [[:follow \"player102\"->\"player100\" @0 {}]] |\n+----------------------------------------------------+--------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/optional-match/","title":"OPTIONAL MATCH","text":"Caution
The feature is still in beta. It will continue to be optimized.
The OPTIONAL MATCH
clause is used to search for the pattern described in it. OPTIONAL MATCH
matches patterns against your graph database, just like MATCH
does. The difference is that if no matches are found, OPTIONAL MATCH
will use a null for missing parts of the pattern.
This topic applies to the openCypher syntax in nGQL only.
"},{"location":"3.ngql-guide/7.general-query-statements/optional-match/#limitations","title":"Limitations","text":"The WHERE
clause cannot be used in an OPTIONAL MATCH
clause.
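Because WHERE cannot be attached to the OPTIONAL MATCH clause itself, one possible workaround (a sketch, not from the original manual; whether the planner accepts a given filter depends on the pattern) is to put the condition in the WHERE clause of the mandatory MATCH part:
nebula> MATCH (m)-[]->(n) WHERE id(m)=="player100" AND id(n)!="team204" \
        OPTIONAL MATCH (n)-[]->(l) \
        RETURN id(m),id(n),id(l);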
The example of the use of OPTIONAL MATCH
in the MATCH
statement is as follows:
nebula> MATCH (m)-[]->(n) WHERE id(m)==\"player100\" \\\n OPTIONAL MATCH (n)-[]->(l) \\\n RETURN id(m),id(n),id(l);\n+-------------+-------------+-------------+\n| id(m) | id(n) | id(l) |\n+-------------+-------------+-------------+\n| \"player100\" | \"team204\" | __NULL__ |\n| \"player100\" | \"player101\" | \"team204\" |\n| \"player100\" | \"player101\" | \"team215\" |\n| \"player100\" | \"player101\" | \"player100\" |\n| \"player100\" | \"player101\" | \"player102\" |\n| \"player100\" | \"player101\" | \"player125\" |\n| \"player100\" | \"player125\" | \"team204\" |\n| \"player100\" | \"player125\" | \"player100\" |\n+-------------+-------------+-------------+\n
Using multiple MATCH
instead of OPTIONAL MATCH
returns rows that match the pattern exactly. The example is as follows:
nebula> MATCH (m)-[]->(n) WHERE id(m)==\"player100\" \\\n MATCH (n)-[]->(l) \\\n RETURN id(m),id(n),id(l);\n+-------------+-------------+-------------+\n| id(m) | id(n) | id(l) |\n+-------------+-------------+-------------+\n| \"player100\" | \"player101\" | \"team204\" |\n| \"player100\" | \"player101\" | \"team215\" |\n| \"player100\" | \"player101\" | \"player100\" |\n| \"player100\" | \"player101\" | \"player102\" |\n| \"player100\" | \"player101\" | \"player125\" |\n| \"player100\" | \"player125\" | \"team204\" |\n| \"player100\" | \"player125\" | \"player100\" |\n+-------------+-------------+-------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/1.show-charset/","title":"SHOW CHARSET","text":"The SHOW CHARSET
statement shows the available character sets.
Currently available types are utf8
and utf8mb4
. The default charset type is utf8
. NebulaGraph extends utf8
to support four-byte characters. Therefore, utf8
and utf8mb4
are equivalent.
SHOW CHARSET;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/1.show-charset/#example","title":"Example","text":"nebula> SHOW CHARSET;\n+---------+-----------------+-------------------+--------+\n| Charset | Description | Default collation | Maxlen |\n+---------+-----------------+-------------------+--------+\n| \"utf8\" | \"UTF-8 Unicode\" | \"utf8_bin\" | 4 |\n+---------+-----------------+-------------------+--------+\n
Parameter Description Charset
The name of the character set. Description
The description of the character set. Default collation
The default collation of the character set. Maxlen
The maximum number of bytes required to store one character."},{"location":"3.ngql-guide/7.general-query-statements/6.show/10.show-roles/","title":"SHOW ROLES","text":"The SHOW ROLES
statement shows the roles that are assigned to a user account.
The return message differs according to the role of the user who is running this statement:
GOD
or ADMIN
and is granted access to the specified graph space, NebulaGraph shows all roles in this graph space except for GOD
.DBA
, USER
, or GUEST
and is granted access to the specified graph space, NebulaGraph shows the user's own role in this graph space.PermissionError
.For more information about roles, see Roles and privileges.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/10.show-roles/#syntax","title":"Syntax","text":"SHOW ROLES IN <space_name>;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/10.show-roles/#example","title":"Example","text":"nebula> SHOW ROLES in basketballplayer;\n+---------+-----------+\n| Account | Role Type |\n+---------+-----------+\n| \"user1\" | \"ADMIN\" |\n+---------+-----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/11.show-snapshots/","title":"SHOW SNAPSHOTS","text":"The SHOW SNAPSHOTS
statement shows the information of all the snapshots.
For how to create a snapshot and backup data, see Snapshot.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/11.show-snapshots/#role_requirement","title":"Role requirement","text":"Only the root
user who has the GOD
role can use the SHOW SNAPSHOTS
statement.
SHOW SNAPSHOTS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/11.show-snapshots/#example","title":"Example","text":"nebula> SHOW SNAPSHOTS;\n+--------------------------------+---------+-----------------------------------------------------+\n| Name | Status | Hosts |\n+--------------------------------+---------+-----------------------------------------------------+\n| \"SNAPSHOT_2020_12_16_11_13_55\" | \"VALID\" | \"storaged0:9779, storaged1:9779, storaged2:9779\" |\n| \"SNAPSHOT_2020_12_16_11_14_10\" | \"VALID\" | \"storaged0:9779, storaged1:9779, storaged2:9779\" |\n+--------------------------------+---------+-----------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/12.show-spaces/","title":"SHOW SPACES","text":"The SHOW SPACES
statement shows existing graph spaces in NebulaGraph.
For how to create a graph space, see CREATE SPACE.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/12.show-spaces/#syntax","title":"Syntax","text":"SHOW SPACES;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/12.show-spaces/#example","title":"Example","text":"nebula> SHOW SPACES;\n+---------------------+\n| Name |\n+---------------------+\n| \"docs\" |\n| \"basketballplayer\" |\n+---------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/14.show-stats/","title":"SHOW STATS","text":"The SHOW STATS
statement shows the statistics of the graph space collected by the latest SUBMIT JOB STATS
job.
The statistics include the following information:
Warning
The data returned by SHOW STATS
is not real-time. The returned data is collected by the latest SUBMIT JOB STATS job and may include TTL-expired data. The expired data will be deleted and not included in the statistics the next time the Compaction operation is performed.
You have to run the SUBMIT JOB STATS
statement in the graph space where you want to collect statistics. For more information, see SUBMIT JOB STATS.
Caution
The result of the SHOW STATS
statement is based on the last executed SUBMIT JOB STATS
statement. If you want to update the result, run SUBMIT JOB STATS
again. Otherwise the statistics will be wrong.
SHOW STATS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/14.show-stats/#examples","title":"Examples","text":"# Choose a graph space.\nnebula> USE basketballplayer;\n\n# Start SUBMIT JOB STATS.\nnebula> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 98 |\n+------------+\n\n# Make sure the job executes successfully.\nnebula> SHOW JOB 98;\n+----------------+---------------+------------+----------------------------+----------------------------+-------------+\n| Job Id(TaskId) | Command(Dest) | Status | Start Time | Stop Time | Error Code |\n+----------------+---------------+------------+----------------------------+----------------------------+-------------+\n| 98 | \"STATS\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| 0 | \"storaged2\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| 1 | \"storaged0\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| 2 | \"storaged1\" | \"FINISHED\" | 2021-11-01T09:33:21.000000 | 2021-11-01T09:33:21.000000 | \"SUCCEEDED\" |\n| \"Total:3\" | \"Succeeded:3\" | \"Failed:0\" | \"In Progress:0\" | \"\" | \"\" |\n+----------------+---------------+------------+----------------------------+----------------------------+-------------+\n\n# Show the statistics of the graph space.\nnebula> SHOW STATS;\n+---------+------------+-------+\n| Type | Name | Count |\n+---------+------------+-------+\n| \"Tag\" | \"player\" | 51 |\n| \"Tag\" | \"team\" | 30 |\n| \"Edge\" | \"follow\" | 81 |\n| \"Edge\" | \"serve\" | 152 |\n| \"Space\" | \"vertices\" | 81 |\n| \"Space\" | \"edges\" | 233 |\n+---------+------------+-------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/15.show-tags-edges/","title":"SHOW TAGS/EDGES","text":"The SHOW TAGS
statement shows all the tags in the current graph space.
The SHOW EDGES
statement shows all the edge types in the current graph space.
SHOW {TAGS | EDGES};\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/15.show-tags-edges/#examples","title":"Examples","text":"nebula> SHOW TAGS;\n+----------+\n| Name |\n+----------+\n| \"player\" |\n| \"star\" |\n| \"team\" |\n+----------+\n\nnebula> SHOW EDGES;\n+----------+\n| Name |\n+----------+\n| \"follow\" |\n| \"serve\" |\n+----------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/16.show-users/","title":"SHOW USERS","text":"The SHOW USERS
statement shows the user information.
Only the root
user who has the GOD
role can use the SHOW USERS
statement.
SHOW USERS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/16.show-users/#example","title":"Example","text":"nebula> SHOW USERS;\n+---------+-----------------+\n| Account | IP Whitelist |\n+---------+-----------------+\n| \"root\" | \"\" |\n| \"user1\" | \"\" |\n| \"user2\" | \"192.168.10.10\" |\n+---------+-----------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/17.show-sessions/","title":"SHOW SESSIONS","text":"When a user logs in to the database, a corresponding session will be created and users can query for session information.
The SHOW SESSIONS
statement shows the information of all the sessions. It can also show a specified session with its ID.
release
to release the session and clear the session information when you run exit
after the operation ends. If you exit the database in an unexpected way and the session timeout duration is not set via session_idle_timeout_secs
in nebula-graphd.conf, the session will not be released automatically. For those sessions that are not automatically released, you need to delete them manually. For details, see KILL SESSIONS.SHOW SESSIONS
queries the session information of all the Graph services.SHOW LOCAL SESSIONS
queries the session information of the currently connected Graph service and does not query the session information of other Graph services.SHOW SESSION <Session_Id>
queries the session information with a specific session id.SHOW [LOCAL] SESSIONS;\nSHOW SESSION <Session_Id>;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/17.show-sessions/#examples","title":"Examples","text":"nebula> SHOW SESSIONS;\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| SessionId | UserName | SpaceName | CreateTime | UpdateTime | GraphAddr | Timezone | ClientIp |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| 1651220858102296 | \"root\" | \"basketballplayer\" | 2022-04-29T08:27:38.102296 | 2022-04-29T08:50:46.282921 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1651199330300991 | \"root\" | \"basketballplayer\" | 2022-04-29T02:28:50.300991 | 2022-04-29T08:16:28.339038 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1651112899847744 | \"root\" | \"basketballplayer\" | 2022-04-28T02:28:19.847744 | 2022-04-28T08:17:44.470210 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1651041092662100 | \"root\" | \"basketballplayer\" | 2022-04-27T06:31:32.662100 | 2022-04-27T07:01:25.200978 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1650959429593975 | \"root\" | \"basketballplayer\" | 2022-04-26T07:50:29.593975 | 2022-04-26T07:51:47.184810 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n| 1650958897679595 | \"root\" | \"\" | 2022-04-26T07:41:37.679595 | 2022-04-26T07:41:37.683802 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n\nnebula> SHOW SESSION 1635254859271703;\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| SessionId | UserName | SpaceName | CreateTime | UpdateTime | GraphAddr | Timezone | ClientIp |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n| 1651220858102296 | \"root\" | \"basketballplayer\" | 2022-04-29T08:27:38.102296 | 2022-04-29T08:50:54.254384 | \"127.0.0.1:9669\" | 0 | \"127.0.0.1\" |\n+------------------+----------+--------------------+----------------------------+----------------------------+------------------+----------+--------------------+\n
Parameter Description SessionId
The session ID, namely the identifier of a session. UserName
The username in a session. SpaceName
The name of the graph space that the user uses currently. It is null (\"\"
) when you first log in because there is no specified graph space. CreateTime
The time when the session is created, namely the time when the user logs in. The time zone is specified by timezone_name
in the configuration file. UpdateTime
The system will update the time when there is an operation. The time zone is specified by timezone_name
in the configuration file. GraphAddr
The IP (or hostname) and port of the Graph server that hosts the session. Timezone
A reserved parameter that has no specified meaning for now. ClientIp
The IP or hostname of the client."},{"location":"3.ngql-guide/7.general-query-statements/6.show/18.show-queries/","title":"SHOW QUERIES","text":"The SHOW QUERIES
statement shows the information of working queries in the current session.
Note
To terminate queries, see Kill Query.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/18.show-queries/#precautions","title":"Precautions","text":"SHOW LOCAL QUERIES
statement gets the status of queries in the current session from the local cache with almost no latency.SHOW QUERIES
statement gets the information of queries in all the sessions from the Meta Service. The information will be synchronized to the Meta Service according to the interval defined by session_reclaim_interval_secs
. Therefore the information that you get from the client may belong to the last synchronization interval.SHOW [LOCAL] QUERIES;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/18.show-queries/#examples","title":"Examples","text":"nebula> SHOW LOCAL QUERIES;\n+------------------+-----------------+--------+----------------------+----------------------------+----------------+-----------+-----------------------+\n| SessionID | ExecutionPlanID | User | Host | StartTime | DurationInUSec | Status | Query |\n+------------------+-----------------+--------+----------------------+----------------------------+----------------+-----------+-----------------------+\n| 1625463842921750 | 46 | \"root\" | \"\"192.168.x.x\":9669\" | 2021-07-05T05:44:19.502903 | 0 | \"RUNNING\" | \"SHOW LOCAL QUERIES;\" |\n+------------------+-----------------+--------+----------------------+----------------------------+----------------+-----------+-----------------------+\n\nnebula> SHOW QUERIES;\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+---------------------------------------------------------+\n| SessionID | ExecutionPlanID | User | Host | StartTime | DurationInUSec | Status | Query |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+---------------------------------------------------------+\n| 1625456037718757 | 54 | \"user1\" | \"\"192.168.x.x\":9669\" | 2021-07-05T05:51:08.691318 | 1504502 | \"RUNNING\" | \"MATCH p=(v:player)-[*1..4]-(v2) RETURN v2 AS Friends;\" |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+---------------------------------------------------------+\n\n# The following statement returns the top 10 queries that have the longest duration.\nnebula> SHOW QUERIES | ORDER BY $-.DurationInUSec DESC | LIMIT 10;\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+-------------------------------------------------------+\n| SessionID | ExecutionPlanID | User | Host | StartTime | DurationInUSec | Status | Query |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+-------------------------------------------------------+\n| 1625471375320831 | 98 | \"user2\" | \"\"192.168.x.x\":9669\" | 2021-07-05T07:50:24.461779 | 2608176 | \"RUNNING\" | \"MATCH (v:player)-[*1..4]-(v2) RETURN v2 AS Friends;\" |\n| 1625456037718757 | 99 | \"user1\" | \"\"192.168.x.x\":9669\" | 2021-07-05T07:50:24.910616 | 2159333 | \"RUNNING\" | \"MATCH (v:player)-[*1..4]-(v2) RETURN v2 AS Friends;\" |\n+------------------+-----------------+---------+----------------------+----------------------------+----------------+-----------+-------------------------------------------------------+\n
The descriptions are as follows.
Parameter DescriptionSessionID
The session ID. ExecutionPlanID
The ID of the execution plan. User
The username that executes the query. Host
The IP address and port of the Graph server that hosts the session. StartTime
The time when the query starts. DurationInUSec
The duration of the query. The unit is microsecond. Status
The current status of the query. Query
The query statement."},{"location":"3.ngql-guide/7.general-query-statements/6.show/19.show-meta-leader/","title":"SHOW META LEADER","text":"The SHOW META LEADER
statement shows the information of the leader in the current Meta cluster.
For more information about the Meta service, see Meta service.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/19.show-meta-leader/#syntax","title":"Syntax","text":"SHOW META LEADER;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/19.show-meta-leader/#example","title":"Example","text":"nebula> SHOW META LEADER;\n+------------------+---------------------------+\n| Meta Leader | secs from last heart beat |\n+------------------+---------------------------+\n| \"127.0.0.1:9559\" | 3 |\n+------------------+---------------------------+\n
Parameter Description Meta Leader
Shows the information of the leader in the Meta cluster, including the IP (or hostname) and port of the server where the leader is located. secs from last heart beat
Indicates the time interval since the last heartbeat. This parameter is measured in seconds."},{"location":"3.ngql-guide/7.general-query-statements/6.show/2.show-collation/","title":"SHOW COLLATION","text":"The SHOW COLLATION
statement shows the collations supported by NebulaGraph.
Currently available types are: utf8_bin
and utf8mb4_bin
.
utf8
, the default collate is utf8_bin
.utf8mb4
, the default collate is utf8mb4_bin
.SHOW COLLATION;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/2.show-collation/#example","title":"Example","text":"nebula> SHOW COLLATION;\n+------------+---------+\n| Collation | Charset |\n+------------+---------+\n| \"utf8_bin\" | \"utf8\" |\n+------------+---------+\n
Parameter Description Collation
The name of the collation. Charset
The name of the character set with which the collation is associated."},{"location":"3.ngql-guide/7.general-query-statements/6.show/4.show-create-space/","title":"SHOW CREATE SPACE","text":"The SHOW CREATE SPACE
statement shows the creating statement of the specified graph space.
For details about the graph space information, see CREATE SPACE.
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/4.show-create-space/#syntax","title":"Syntax","text":"SHOW CREATE SPACE <space_name>;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/4.show-create-space/#example","title":"Example","text":"nebula> SHOW CREATE SPACE basketballplayer;\n+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------+\n| Space | Create Space |\n+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------+\n| \"basketballplayer\" | \"CREATE SPACE `basketballplayer` (partition_num = 10, replica_factor = 1, charset = utf8, collate = utf8_bin, vid_type = FIXED_STRING(32))\" |\n+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/5.show-create-tag-edge/","title":"SHOW CREATE TAG/EDGE","text":"The SHOW CREATE TAG
statement shows the basic information of the specified tag. For details about the tag, see CREATE TAG.
The SHOW CREATE EDGE
statement shows the basic information of the specified edge type. For details about the edge type, see CREATE EDGE.
SHOW CREATE {TAG <tag_name> | EDGE <edge_name>};\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/5.show-create-tag-edge/#examples","title":"Examples","text":"nebula> SHOW CREATE TAG player;\n+----------+-----------------------------------+\n| Tag | Create Tag |\n+----------+-----------------------------------+\n| \"player\" | \"CREATE TAG `player` ( |\n| | `name` string NULL, |\n| | `age` int64 NULL |\n| | ) ttl_duration = 0, ttl_col = \"\"\" |\n+----------+-----------------------------------+\n\nnebula> SHOW CREATE EDGE follow;\n+----------+-----------------------------------+\n| Edge | Create Edge |\n+----------+-----------------------------------+\n| \"follow\" | \"CREATE EDGE `follow` ( |\n| | `degree` int64 NULL |\n| | ) ttl_duration = 0, ttl_col = \"\"\" |\n+----------+-----------------------------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/6.show-hosts/","title":"SHOW HOSTS","text":"The SHOW HOSTS
statement shows the cluster information, including the port, status, leader, partition, and version information. You can also add the service type in the statement to view the information of the specific service.
SHOW HOSTS [GRAPH | STORAGE | META];\n
Note
For a NebulaGraph cluster installed with the source code, the version of the cluster will not be displayed in the output after executing the command SHOW HOSTS (GRAPH | STORAGE | META)
with the service name.
nebula> SHOW HOSTS;\n+-------------+-------+----------+--------------+----------------------------------+------------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+-------+----------+--------------+----------------------------------+------------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 8 | \"docs:5, basketballplayer:3\" | \"docs:5, basketballplayer:3\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 9 | \"basketballplayer:4, docs:5\" | \"docs:5, basketballplayer:4\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 8 | \"basketballplayer:3, docs:5\" | \"docs:5, basketballplayer:3\" | \"master\" |\n+-------------+-------+----------+--------------+----------------------------------+------------------------------+---------+\n\nnebula> SHOW HOSTS GRAPH;\n+-----------+------+----------+---------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-----------+------+----------+---------+--------------+---------+\n| \"graphd\" | 9669 | \"ONLINE\" | \"GRAPH\" | \"3ba41bd\" | \"master\" |\n| \"graphd1\" | 9669 | \"ONLINE\" | \"GRAPH\" | \"3ba41bd\" | \"master\" |\n| \"graphd2\" | 9669 | \"ONLINE\" | \"GRAPH\" | \"3ba41bd\" | \"master\" |\n+-----------+------+----------+---------+--------------+---------+\n\nnebula> SHOW HOSTS STORAGE;\n+-------------+------+----------+-----------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-------------+------+----------+-----------+--------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n+-------------+------+----------+-----------+--------------+---------+\n\nnebula> SHOW HOSTS META;\n+----------+------+----------+--------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+----------+------+----------+--------+--------------+---------+\n| \"metad2\" | 9559 | \"ONLINE\" | \"META\" | \"3ba41bd\" | \"master\" |\n| \"metad0\" | 9559 | \"ONLINE\" | \"META\" | \"3ba41bd\" | \"master\" |\n| \"metad1\" | 9559 | \"ONLINE\" | \"META\" | \"3ba41bd\" | \"master\" |\n+----------+------+----------+--------+--------------+---------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/7.show-index-status/","title":"SHOW INDEX STATUS","text":"The SHOW INDEX STATUS
statement shows the status of jobs that rebuild native indexes, which helps check whether a native index is successfully rebuilt or not.
SHOW {TAG | EDGE} INDEX STATUS;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/7.show-index-status/#examples","title":"Examples","text":"nebula> SHOW TAG INDEX STATUS;\n+------------------------------------+--------------+\n| Name | Index Status |\n+------------------------------------+--------------+\n| \"date1_index\" | \"FINISHED\" |\n| \"basketballplayer_all_tag_indexes\" | \"FINISHED\" |\n| \"any_shape_geo_index\" | \"FINISHED\" |\n+------------------------------------+--------------+\n\nnebula> SHOW EDGE INDEX STATUS;\n+----------------+--------------+\n| Name | Index Status |\n+----------------+--------------+\n| \"follow_index\" | \"FINISHED\" |\n+----------------+--------------+\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/7.show-index-status/#related_topics","title":"Related topics","text":"The SHOW INDEXES
statement shows the names of existing native indexes.
SHOW {TAG | EDGE} INDEXES;\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/8.show-indexes/#examples","title":"Examples","text":"nebula> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\n\nnebula> SHOW EDGE INDEXES;\n+----------------+----------+---------+\n| Index Name | By Edge | Columns |\n+----------------+----------+---------+\n| \"follow_index\" | \"follow\" | [] |\n+----------------+----------+---------+\n
Legacy version compatibility
In NebulaGraph 2.x, SHOW TAG/EDGE INDEXES
only returns Names
.
The SHOW PARTS
statement shows the information of a specified partition or all partitions in a graph space.
SHOW PARTS [<part_id>];\n
"},{"location":"3.ngql-guide/7.general-query-statements/6.show/9.show-parts/#examples","title":"Examples","text":"nebula> SHOW PARTS;\n+--------------+--------------------+--------------------+-------+\n| Partition ID | Leader | Peers | Losts |\n+--------------+--------------------+--------------------+-------+\n| 1 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n| 2 | \"192.168.2.2:9779\" | \"192.168.2.2:9779\" | \"\" |\n| 3 | \"192.168.2.3:9779\" | \"192.168.2.3:9779\" | \"\" |\n| 4 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n| 5 | \"192.168.2.2:9779\" | \"192.168.2.2:9779\" | \"\" |\n| 6 | \"192.168.2.3:9779\" | \"192.168.2.3:9779\" | \"\" |\n| 7 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n| 8 | \"192.168.2.2:9779\" | \"192.168.2.2:9779\" | \"\" |\n| 9 | \"192.168.2.3:9779\" | \"192.168.2.3:9779\" | \"\" |\n| 10 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n+--------------+--------------------+--------------------+-------+\n\nnebula> SHOW PARTS 1;\n+--------------+--------------------+--------------------+-------+\n| Partition ID | Leader | Peers | Losts |\n+--------------+--------------------+--------------------+-------+\n| 1 | \"192.168.2.1:9779\" | \"192.168.2.1:9779\" | \"\" |\n+--------------+--------------------+--------------------+-------+\n
The descriptions are as follows.
Parameter DescriptionPartition ID
The ID of the partition. Leader
The IP (or hostname) and the port of the leader. Peers
The IPs (or hostnames) and the ports of all the replicas. Losts
The IPs (or hostnames) and the ports of replicas at fault."},{"location":"3.ngql-guide/8.clauses-and-options/group-by/","title":"GROUP BY","text":"The GROUP BY
clause can be used to aggregate data.
This topic applies to native nGQL only.
You can also use the count() function to aggregate data.
nebula> MATCH (v:player)<-[:follow]-(:player) RETURN v.player.name AS Name, count(*) as cnt ORDER BY cnt DESC;\n+----------------------+-----+\n| Name | cnt |\n+----------------------+-----+\n| \"Tim Duncan\" | 10 |\n| \"LeBron James\" | 6 |\n| \"Tony Parker\" | 5 |\n| \"Chris Paul\" | 4 |\n| \"Manu Ginobili\" | 4 |\n+----------------------+-----+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/group-by/#syntax","title":"Syntax","text":"The GROUP BY
clause groups the rows with the same value. Then operations such as counting, sorting, and calculation can be applied.
The GROUP BY
clause works after the pipe symbol (|) and before a YIELD
clause.
| GROUP BY <var> YIELD <var>, <aggregation_function(var)>\n
The aggregation_function()
function supports avg()
, sum()
, max()
, min()
, count()
, collect()
, and std()
.
The following statement finds all the vertices connected directly to vertex \"player100\"
, groups the result set by player names, and counts how many times the name shows up in the result set.
nebula> GO FROM \"player100\" OVER follow BIDIRECT \\\n YIELD properties($$).name as Name \\\n | GROUP BY $-.Name \\\n YIELD $-.Name as Player, count(*) AS Name_Count;\n+---------------------+------------+\n| Player | Name_Count |\n+---------------------+------------+\n| \"Shaquille O'Neal\" | 1 |\n| \"Tiago Splitter\" | 1 |\n| \"Manu Ginobili\" | 2 |\n| \"Boris Diaw\" | 1 |\n| \"LaMarcus Aldridge\" | 1 |\n| \"Tony Parker\" | 2 |\n| \"Marco Belinelli\" | 1 |\n| \"Dejounte Murray\" | 1 |\n| \"Danny Green\" | 1 |\n| \"Aron Baynes\" | 1 |\n+---------------------+------------+\n
The following statement finds all the vertices connected directly to vertex \"player100\"
, groups the result set by source vertices, and returns the sum of degree values.
nebula> GO FROM \"player100\" OVER follow \\\n YIELD src(edge) AS player, properties(edge).degree AS degree \\\n | GROUP BY $-.player \\\n YIELD sum($-.degree);\n+----------------+\n| sum($-.degree) |\n+----------------+\n| 190 |\n+----------------+\n
For more information about the sum()
function, see Built-in math functions.
The usage of GROUP BY
in the above nGQL statements that explicitly write GROUP BY
and act as grouping fields is called explicit GROUP BY
, while in openCypher, the GROUP BY
is implicit, i.e., GROUP BY
groups fields without explicitly writing GROUP BY
. The explicit GROUP BY
in nGQL is the same as the implicit GROUP BY
in openCypher, and nGQL also supports the implicit GROUP BY
. For the implicit usage of GROUP BY
, see Stack Overflow.
For example, to look up the players over 34 years old with the same length of service, you can use the following statement:
nebula> LOOKUP ON player WHERE player.age > 34 YIELD id(vertex) AS v | \\\n GO FROM $-.v OVER serve YIELD serve.start_year AS start_year, serve.end_year AS end_year | \\\n YIELD $-.start_year, $-.end_year, count(*) AS count | \\\n ORDER BY $-.count DESC | LIMIT 5;\n+---------------+-------------+-------+\n| $-.start_year | $-.end_year | count |\n+---------------+-------------+-------+\n| 2018 | 2019 | 3 |\n| 2007 | 2012 | 2 |\n| 1998 | 2004 | 2 |\n| 2017 | 2018 | 2 |\n| 2010 | 2011 | 2 |\n+---------------+-------------+-------+ \n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/","title":"INNER JOIN","text":"INNER JOIN
is a type of join query that matches records based on common column values between two tables. It is commonly used to create a result set that includes two tables based on values in their associated columns. In NebulaGraph, the INNER JOIN
clause can be explicitly used to conduct join queries between two tables, leading to more complex query results.
Note
In nGQL statements, the multi-hop query of GO
implicitly utilizes the INNER JOIN
clause. For example, in the statement GO 1 TO 2 STEPS FROM \"player101\" OVER follow YIELD $$.player.name AS name, $$.player.age AS age
, the GO
clause implicitly utilizes the INNER JOIN
clause, matching the result columns of the first-hop query starting from player101
along the follow
edge with the starting columns of the second-hop query. Then, based on the matching results, it returns name
and age
.
The INNER JOIN
clause is only applicable to the native nGQL syntax.
YIELD <column_name_list>\nFROM <first_table> INNER JOIN <second_table> ON <join_condition>\n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/#notes","title":"Notes","text":"To conduct an INNER JOIN
query, you need to follow these rules:
YIELD
clause to specify the returned columns, and place it before the INNER JOIN
clause.FROM
clause to specify the two tables to be joined.INNER JOIN
clause must contain the ON
clause, which specifies the join condition. The join condition only supports equi-join (i.e., ==
).<first_table>
and <second_table>
are the two tables to be joined, and the two table names cannot be the same.The following examples show how to use the INNER JOIN
clause to join the results of two queries in nGQL statements.
Firstly, the dst
column obtained from the initial LOOK UP
operation (whose value for Tony Parker has an ID of player101
) is connected with the src
column obtained from the second GO
query (which has IDs player101
and player125
). By matching the two columns where player101
appears on both sides, we obtain the resulting data set. The final request then uses a YIELD
statement YIELD $b.vid AS vid, $a.v AS v, $b.e2 AS e2
to display the information.
nebula> $a = LOOKUP ON player WHERE player.name == 'Tony Parker' YIELD id(vertex) as dst, vertex AS v; \\\n $b = GO FROM 'player101', 'player125' OVER follow YIELD id($^) as src, id($$) as vid, edge AS e2; \\\n YIELD $b.vid AS vid, $a.v AS v, $b.e2 AS e2 FROM $a INNER JOIN $b ON $a.dst == $b.src;\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| vid | v | e2 |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| \"player100\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player100\" @0 {degree: 95}] |\n| \"player102\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| \"player125\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player125\" @0 {degree: 95}] |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/#example_2","title":"Example 2","text":"The following nGQL example utilizes the INNER JOIN
clause to combine the src
column from the first LOOKUP
query (with player101
as ID for Tony Parker
) and the src
column from the second FETCH
query (with player101
being the starting point to player100
). By matching player101
in both source columns, we obtain the resulting data set. The final request then utilizes a YIELD
clause YIELD $a.src AS src, $a.v AS v, $b.e AS e
to display the information.
nebula> $a = LOOKUP ON player WHERE player.name == 'Tony Parker' YIELD id(vertex) as src, vertex AS v; \\\n $b = FETCH PROP ON follow 'player101'->'player100' YIELD src(edge) as src, edge as e; \\\n YIELD $a.src AS src, $a.v AS v, $b.e AS e FROM $a INNER JOIN $b ON $a.src == $b.src;\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| src | v | e |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n| \"player101\" | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) | [:follow \"player101\"->\"player100\" @0 {degree: 95}] |\n+-------------+-----------------------------------------------------+----------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/joins/#example_3","title":"Example 3","text":"The following example shows the process of using the INNER JOIN
clause to join the results of the LOOKUP
, GO
, and FIND PATH
clauses.
player
table using the LOOKUP ON
statement to find the vertex for player Tony Parker
, storing the ID and properties in the $a.src
and $a.v
columns, respectively.GO
statement to find player nodes that are reachable in 2-5 steps through the follow
edges from the node $a.src
. It also requires that the players corresponding to these nodes have an age greater than 30 years old. We store the IDs of these nodes in the $b.dst
column.FIND ALL PATH
statement to find all the paths that traverse the follow
edges from $a.src
to $b.dst
. We also return the paths themselves as $c.p
and the destination of each path as $c.dst
.FIND SHORTEST PATH
statement, find the shortest path from $c.dst
back to $a.src
, storing the path in $d.p
and the starting point in $d.src
.INNER JOIN
clause to join the results of steps 3 and 4 by matching the $c.dst
column with the $d.src
column. Then use the YIELD
statement YIELD $c.forward AS forwardPath, $c.dst AS end, $d.p AS backwardPath
to return the matched records of the join.nebula> $a = LOOKUP ON player WHERE player.name == 'Tony Parker' YIELD id(vertex) as src, vertex AS v; \\\n $b = GO 2 TO 5 STEPS FROM $a.src OVER follow WHERE $$.player.age > 30 YIELD id($$) AS dst; \\\n $c = (FIND ALL PATH FROM $a.src TO $b.dst OVER follow YIELD path AS p | YIELD $-.p AS forward, id(endNode($-.p)) AS dst); \\\n $d = (FIND SHORTEST PATH FROM $c.dst TO $a.src OVER follow YIELD path AS p | YIELD $-.p AS p, id(startNode($-.p)) AS src); \\\n YIELD $c.forward AS forwardPath, $c.dst AS end, $d.p AS backwordPath FROM $c INNER JOIN $d ON $c.dst == $d.src;\n+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-----------------------------------------------------------------------------+\n| forwardPath | end | backwordPath |\n+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-----------------------------------------------------------------------------+\n| <(\"player101\")-[:follow@0 {}]->(\"player102\")> | \"player102\" | <(\"player102\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player102\")> | \"player102\" | <(\"player102\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player102\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n| <(\"player101\")-[:follow@0 {}]->(\"player102\")-[:follow@0 {}]->(\"player101\")-[:follow@0 {}]->(\"player125\")> | \"player125\" | <(\"player125\")-[:follow@0 {}]->(\"player100\")-[:follow@0 {}]->(\"player101\")> |\n...\n+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-----------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/","title":"LIMIT AND SKIP","text":"The LIMIT
clause constrains the number of rows in the output. The usage of LIMIT
in native nGQL statements and openCypher compatible statements is different.
|
needs to be used before the LIMIT
clause. The offset parameter can be set or omitted directly after the LIMIT
statement.LIMIT
clause. And you can use SKIP
to indicate an offset.Note
When using LIMIT
in either syntax above, it is important to use an ORDER BY
clause that constrains the output into a unique order. Otherwise, you will get an unpredictable subset of the output.
In native nGQL, LIMIT
has general syntax and exclusive syntax in GO
statements.
In native nGQL, the general LIMIT
syntax works the same as in SQL
. The LIMIT
clause accepts one or two parameters. The values of both parameters must be non-negative integers and be used after a pipe. The syntax and description are as follows:
... | LIMIT [<offset>,] <number_rows>;\n
Parameter Description offset
The offset value. It defines the row from which to start returning. The offset starts from 0
. The default value is 0
, which returns from the first row. number_rows
It constrains the total number of returned rows. For example:
# The following example returns the top 3 rows of data from the result.\nnebula> LOOKUP ON player YIELD id(vertex)|\\\n LIMIT 3;\n+-------------+\n| id(VERTEX) |\n+-------------+\n| \"player100\" |\n| \"player101\" |\n| \"player102\" |\n+-------------+\n\n# The following example returns the 3 rows of data starting from the second row of the sorted output.\nnebula> GO FROM \"player100\" OVER follow REVERSELY \\\n YIELD properties($$).name AS Friend, properties($$).age AS Age \\\n | ORDER BY $-.Age, $-.Friend \\\n | LIMIT 1, 3;\n+-------------------+-----+\n| Friend | Age |\n+-------------------+-----+\n| \"Danny Green\" | 31 |\n| \"Aron Baynes\" | 32 |\n| \"Marco Belinelli\" | 32 |\n+-------------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#limit_in_go_statements","title":"LIMIT in GO statements","text":"In addition to the general syntax in the native nGQL, the LIMIT
in the GO
statement also supports limiting the number of output results based on edges.
Syntax:
<go_statement> LIMIT <limit_list>;\n
limit_list
is a list. Elements in the list must be natural numbers, and the number of elements must be the same as the maximum number of STEPS
in the GO
statement. The following takes GO 1 TO 3 STEPS FROM \"A\" OVER * LIMIT <limit_list>
as an example to introduce this usage of LIMIT
in detail.
limit_list
must contain 3 natural numbers, such as GO 1 TO 3 STEPS FROM \"A\" OVER * LIMIT [1,2,4]
.1
in LIMIT [1,2,4]
means that the system automatically selects 1 edge to continue traversal in the first step. 2
means to select 2 edges to continue traversal in the second step. 4
indicates that 4 edges are selected to continue traversal in the third step.GO 1 TO 3 STEPS
means to return all the traversal results from the first to third steps, all the red edges and their source and destination vertices in the figure below will be matched by this GO
statement. And the yellow edges represent there is no path selected when the GO statement traverses. If it is not GO 1 TO 3 STEPS
but GO 3 STEPS
, it will only match the red edges of the third step and the vertices at both ends.In the basketballplayer dataset, the example is as follows:
nebula> GO 3 STEPS FROM \"player100\" \\\n OVER * \\\n YIELD properties($$).name AS NAME, properties($$).age AS Age \\\n LIMIT [3,3,3];\n+-----------------+----------+\n| NAME | Age |\n+-----------------+----------+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n+-----------------+----------+\n\nnebula> GO 3 STEPS FROM \"player102\" OVER * BIDIRECT\\\n YIELD dst(edge) \\\n LIMIT [rand32(5),rand32(5),rand32(5)];\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player100\" |\n| \"player100\" |\n+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#limit_in_opencypher_compatible_statements","title":"LIMIT in openCypher compatible statements","text":"In openCypher compatible statements such as MATCH
, there is no need to use a pipe when LIMIT
is used. The syntax and description are as follows:
... [SKIP <offset>] [LIMIT <number_rows>];\n
Parameter Description offset
The offset value. It defines the row from which to start returning. The offset starts from 0
. The default value is 0
, which returns from the first row. number_rows
It constrains the total number of returned rows. Both offset
and number_rows
accept expressions, but the result of the expression must be a non-negative integer.
Note
Fraction expressions composed of two integers are automatically floored to integers. For example, 8/6
is floored to 1.
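For example, the following is a minimal sketch against the basketballplayer dataset: LIMIT 8/6 is floored to LIMIT 1, so only the first row of the sorted output is returned.
nebula> MATCH (v:player) RETURN v.player.name AS Name, v.player.age AS Age \\\n        ORDER BY Age LIMIT 8/6;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Luka Doncic\" | 20 |\n+---------------+-----+\n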
LIMIT
can be used alone to return a specified number of results.
nebula> MATCH (v:player) RETURN v.player.name AS Name, v.player.age AS Age \\\n ORDER BY Age LIMIT 5;\n+-------------------------+-----+\n| Name | Age |\n+-------------------------+-----+\n| \"Luka Doncic\" | 20 |\n| \"Ben Simmons\" | 22 |\n| \"Kristaps Porzingis\" | 23 |\n| \"Giannis Antetokounmpo\" | 24 |\n| \"Kyle Anderson\" | 25 |\n+-------------------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#examples_of_skip","title":"Examples of SKIP","text":"SKIP
can be used alone to set the offset and return the data after the specified position.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC SKIP 1;\n+-----------------+-----+\n| Name | Age |\n+-----------------+-----+\n| \"Manu Ginobili\" | 41 |\n| \"Tony Parker\" | 36 |\n+-----------------+-----+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC SKIP 1+1;\n+---------------+-----+\n| Name | Age |\n+---------------+-----+\n| \"Tony Parker\" | 36 |\n+---------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/limit/#example_of_skip_and_limit","title":"Example of SKIP and LIMIT","text":"SKIP
and LIMIT
can be used together to return the specified amount of data starting from the specified position.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC SKIP 1 LIMIT 1;\n+-----------------+-----+\n| Name | Age |\n+-----------------+-----+\n| \"Manu Ginobili\" | 41 |\n+-----------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/","title":"ORDER BY","text":"The ORDER BY
clause specifies the order of the rows in the output.
In native nGQL, you must use a pipe (|) and an ORDER BY clause after the YIELD clause. In openCypher statements, the ORDER BY clause follows a RETURN clause. There are two order options:
ASC
: Ascending. ASC
is the default order. DESC
: Descending.
<YIELD clause>\n| ORDER BY <expression> [ASC | DESC] [, <expression> [ASC | DESC] ...];\n
Compatibility
In the native nGQL syntax, $-.
must be used after ORDER BY
. But it is not required in releases prior to 2.5.0.
nebula> FETCH PROP ON player \"player100\", \"player101\", \"player102\", \"player103\" \\\n YIELD player.age AS age, player.name AS name \\\n | ORDER BY $-.age ASC, $-.name DESC;\n+-----+---------------------+\n| age | name |\n+-----+---------------------+\n| 32 | \"Rudy Gay\" |\n| 33 | \"LaMarcus Aldridge\" |\n| 36 | \"Tony Parker\" |\n| 42 | \"Tim Duncan\" |\n+-----+---------------------+\n\nnebula> $var = GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS dst; \\\n ORDER BY $var.dst DESC;\n+-------------+\n| dst |\n+-------------+\n| \"player125\" |\n| \"player101\" |\n+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/#opencypher_syntax","title":"OpenCypher Syntax","text":"<RETURN clause>\nORDER BY <expression> [ASC | DESC] [, <expression> [ASC | DESC] ...];\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/#examples_1","title":"Examples","text":"nebula> MATCH (v:player) RETURN v.player.name AS Name, v.player.age AS Age \\\n ORDER BY Name DESC;\n+-----------------+-----+\n| Name | Age |\n+-----------------+-----+\n| \"Yao Ming\" | 38 |\n| \"Vince Carter\" | 42 |\n| \"Tracy McGrady\" | 39 |\n| \"Tony Parker\" | 36 |\n| \"Tim Duncan\" | 42 |\n+-----------------+-----+\n...\n\n# In the following example, nGQL sorts the rows by age first. If multiple people are of the same age, nGQL will then sort them by name.\nnebula> MATCH (v:player) RETURN v.player.age AS Age, v.player.name AS Name \\\n ORDER BY Age DESC, Name ASC;\n+-----+-------------------+\n| Age | Name |\n+-----+-------------------+\n| 47 | \"Shaquille O'Neal\" |\n| 46 | \"Grant Hill\" |\n| 45 | \"Jason Kidd\" |\n| 45 | \"Steve Nash\" |\n+-----+-------------------+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/order-by/#order_of_null_values","title":"Order of NULL values","text":"nGQL lists NULL values at the end of the output for ascending sorting, and at the start for descending sorting.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age;\n+-----------------+----------+\n| Name | Age |\n+-----------------+----------+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n| __NULL__ | __NULL__ |\n+-----------------+----------+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"}) --> (v2) \\\n RETURN v2.player.name AS Name, v2.player.age AS Age \\\n ORDER BY Age DESC;\n+-----------------+----------+\n| Name | Age |\n+-----------------+----------+\n| __NULL__ | __NULL__ |\n| \"Manu Ginobili\" | 41 |\n| \"Tony Parker\" | 36 |\n+-----------------+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/","title":"RETURN","text":"The RETURN
clause defines the output of an nGQL query. To return multiple fields, separate them with commas.
RETURN
can lead a clause or a statement:
RETURN
clause can work in openCypher statements in nGQL, such as MATCH
or UNWIND
. A RETURN
statement can work independently to output the result of an expression. This topic applies to the openCypher syntax in nGQL only. For native nGQL, use YIELD
.
RETURN
does not support the following openCypher features yet.
Return variables with uncommon characters, for example:
MATCH (`non-english_characters`:player) \\\nRETURN `non-english_characters`;\n
Set a pattern in the RETURN
clause and return all elements that this pattern matches, for example:
MATCH (v:player) \\\nRETURN (v)-[e]->(v2);\n
When RETURN
returns the map data structure, the order of key-value pairs is undefined.
nebula> RETURN {age: 32, name: \"Marco Belinelli\"};\n+------------------------------------+\n| {age:32,name:\"Marco Belinelli\"} |\n+------------------------------------+\n| {age: 32, name: \"Marco Belinelli\"} |\n+------------------------------------+\n\nnebula> RETURN {zage: 32, name: \"Marco Belinelli\"};\n+-------------------------------------+\n| {zage:32,name:\"Marco Belinelli\"} |\n+-------------------------------------+\n| {name: \"Marco Belinelli\", zage: 32} |\n+-------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_vertices_or_edges","title":"Return vertices or edges","text":"Use the RETURN {<vertex_name> | <edge_name>}
to return vertices and edges all information.
// Return vertices\nnebula> MATCH (v:player) \\\n RETURN v;\n+---------------------------------------------------------------+\n| v |\n+---------------------------------------------------------------+\n| (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}) |\n| (\"player107\" :player{age: 32, name: \"Aron Baynes\"}) |\n| (\"player116\" :player{age: 34, name: \"LeBron James\"}) |\n| (\"player120\" :player{age: 29, name: \"James Harden\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n+---------------------------------------------------------------+\n...\n\n// Return edges\nnebula> MATCH (v:player)-[e]->() \\\n RETURN e;\n+------------------------------------------------------------------------------+\n| e |\n+------------------------------------------------------------------------------+\n| [:follow \"player104\"->\"player100\" @0 {degree: 55}] |\n| [:follow \"player104\"->\"player101\" @0 {degree: 50}] |\n| [:follow \"player104\"->\"player105\" @0 {degree: 60}] |\n| [:serve \"player104\"->\"team200\" @0 {end_year: 2009, start_year: 2007}] |\n| [:serve \"player104\"->\"team208\" @0 {end_year: 2016, start_year: 2015}] |\n+------------------------------------------------------------------------------+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_vids","title":"Return VIDs","text":"Use the id()
function to retrieve VIDs.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN id(v);\n+-------------+\n| id(v) |\n+-------------+\n| \"player100\" |\n+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_tag","title":"Return Tag","text":"Use the labels()
function to return the list of tags on a vertex.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN labels(v);\n+------------+\n| labels(v) |\n+------------+\n| [\"player\"] |\n+------------+\n
To retrieve the nth element in the labels(v)
list, use labels(v)[n-1]
. The following example shows how to use labels(v)[0]
to return the first tag in the list.
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN labels(v)[0];\n+--------------+\n| labels(v)[0] |\n+--------------+\n| \"player\" |\n+--------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_properties","title":"Return properties","text":"When returning properties of a vertex, it is necessary to specify the tag to which the properties belong because a vertex can have multiple tags and the same property name can appear on different tags.
It is possible to specify the tag of a vertex to return all properties of that tag, or to specify both the tag and a property name to return only that property of the tag.
nebula> MATCH (v:player) \\\n RETURN v.player, v.player.name, v.player.age \\\n LIMIT 3;\n+--------------------------------------+---------------------+--------------+\n| v.player | v.player.name | v.player.age |\n+--------------------------------------+---------------------+--------------+\n| {age: 33, name: \"LaMarcus Aldridge\"} | \"LaMarcus Aldridge\" | 33 |\n| {age: 25, name: \"Kyle Anderson\"} | \"Kyle Anderson\" | 25 |\n| {age: 40, name: \"Kobe Bryant\"} | \"Kobe Bryant\" | 40 |\n+--------------------------------------+---------------------+--------------+\n
When returning edge properties, it is not necessary to specify the edge type to which the properties belong, because an edge can only have one edge type.
// Return the property of a vertex\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) \\\n RETURN properties(v2);\n+----------------------------------+\n| properties(v2) |\n+----------------------------------+\n| {name: \"Spurs\"} |\n| {age: 36, name: \"Tony Parker\"} |\n| {age: 41, name: \"Manu Ginobili\"} |\n+----------------------------------+\n
// Return the property of an edge\nnebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN e.start_year, e.degree \\\n+--------------+----------+\n| e.start_year | e.degree |\n+--------------+----------+\n| __NULL__ | 95 |\n| __NULL__ | 95 |\n| 1997 | __NULL__ |\n+--------------+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_edge_type","title":"Return edge type","text":"Use the type()
function to return the matched edge types.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[e]->() \\\n RETURN DISTINCT type(e);\n+----------+\n| type(e) |\n+----------+\n| \"serve\" |\n| \"follow\" |\n+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_paths","title":"Return paths","text":"Use RETURN <path_name>
to return all the information of the matched paths.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[*3]->() \\\n RETURN p;\n+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| p |\n+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})-[:serve@0 {end_year: 2019, start_year: 2015}]->(\"team204\" :team{name: \"Spurs\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})-[:serve@0 {end_year: 2015, start_year: 2006}]->(\"team203\" :team{name: \"Trail Blazers\"})> |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})-[:follow@0 {degree: 75}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> |\n+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_vertices_in_a_path","title":"Return vertices in a path","text":"Use the nodes()
function to return all vertices in a path.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) \\\n RETURN nodes(p);\n+-------------------------------------------------------------------------------------------------------------+\n| nodes(p) |\n+-------------------------------------------------------------------------------------------------------------+\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"team204\" :team{name: \"Spurs\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player101\" :player{age: 36, name: \"Tony Parker\"})] |\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player125\" :player{age: 41, name: \"Manu Ginobili\"})] |\n+-------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_edges_in_a_path","title":"Return edges in a path","text":"Use the relationships()
function to return all edges in a path.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[]->(v2) \\\n RETURN relationships(p);\n+-------------------------------------------------------------------------+\n| relationships(p) |\n+-------------------------------------------------------------------------+\n| [[:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}]] |\n| [[:follow \"player100\"->\"player101\" @0 {degree: 95}]] |\n| [[:follow \"player100\"->\"player125\" @0 {degree: 95}]] |\n+-------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_path_length","title":"Return path length","text":"Use the length()
function to return the length of a path.
nebula> MATCH p=(v:player{name:\"Tim Duncan\"})-[*..2]->(v2) \\\n RETURN p AS Paths, length(p) AS Length;\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+\n| Paths | Length |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})> | 1 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})> | 1 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})> | 1 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:serve@0 {end_year: 2018, start_year: 1999}]->(\"team204\" :team{name: \"Spurs\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:serve@0 {end_year: 2019, start_year: 2018}]->(\"team215\" :team{name: \"Hornets\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 95}]->(\"player100\" :player{age: 42, name: \"Tim Duncan\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 90}]->(\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})-[:serve@0 {end_year: 2018, start_year: 2002}]->(\"team204\" :team{name: \"Spurs\"})> | 2 |\n| <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})-[:follow@0 {degree: 90}]->(\"player100\" :player{age: 42, name: \"Tim Duncan\"})> | 2 |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_all_elements","title":"Return all elements","text":"To return all the elements that this pattern matches, use an asterisk (*).
nebula> MATCH (v:player{name:\"Tim Duncan\"}) \\\n RETURN *;\n+----------------------------------------------------+\n| v |\n+----------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n+----------------------------------------------------+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\n RETURN *;\n+----------------------------------------------------+-----------------------------------------------------------------------+-------------------------------------------------------+\n| v | e | v2 |\n+----------------------------------------------------+-----------------------------------------------------------------------+-------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | [:follow \"player100\"->\"player101\" @0 {degree: 95}] | (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | [:follow \"player100\"->\"player125\" @0 {degree: 95}] | (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) | [:serve \"player100\"->\"team204\" @0 {end_year: 2016, start_year: 1997}] | (\"team204\" :team{name: \"Spurs\"}) |\n+----------------------------------------------------+-----------------------------------------------------------------------+-------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#rename_a_field","title":"Rename a field","text":"Use the AS <alias>
syntax to rename a field in the output.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[:serve]->(v2) \\\n RETURN v2.team.name AS Team;\n+---------+\n| Team |\n+---------+\n| \"Spurs\" |\n+---------+\n\nnebula> RETURN \"Amber\" AS Name;\n+---------+\n| Name |\n+---------+\n| \"Amber\" |\n+---------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_a_non-existing_property","title":"Return a non-existing property","text":"If a property matched does not exist, NULL
is returned.
nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(v2) \\\n RETURN v2.player.name, type(e), v2.player.age;\n+-----------------+----------+---------------+\n| v2.player.name | type(e) | v2.player.age |\n+-----------------+----------+---------------+\n| \"Manu Ginobili\" | \"follow\" | 41 |\n| __NULL__ | \"serve\" | __NULL__ |\n| \"Tony Parker\" | \"follow\" | 36 |\n+-----------------+----------+---------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_expression_results","title":"Return expression results","text":"To return the results of expressions such as literals, functions, or predicates, set them in a RETURN
clause.
nebula> MATCH (v:player{name:\"Tony Parker\"})-->(v2:player) \\\n RETURN DISTINCT v2.player.name, \"Hello\"+\" graphs!\", v2.player.age > 35;\n+---------------------+----------------------+--------------------+\n| v2.player.name | (\"Hello\"+\" graphs!\") | (v2.player.age>35) |\n+---------------------+----------------------+--------------------+\n| \"LaMarcus Aldridge\" | \"Hello graphs!\" | false |\n| \"Tim Duncan\" | \"Hello graphs!\" | true |\n| \"Manu Ginobili\" | \"Hello graphs!\" | true |\n+---------------------+----------------------+--------------------+\n\nnebula> RETURN 1+1;\n+-------+\n| (1+1) |\n+-------+\n| 2 |\n+-------+\n\nnebula> RETURN 1- -1;\n+----------+\n| (1--(1)) |\n+----------+\n| 2 |\n+----------+\n\nnebula> RETURN 3 > 1;\n+-------+\n| (3>1) |\n+-------+\n| true |\n+-------+\n\nnebula> RETURN 1+1, rand32(1, 5);\n+-------+-------------+\n| (1+1) | rand32(1,5) |\n+-------+-------------+\n| 2 | 1 |\n+-------+-------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/return/#return_unique_fields","title":"Return unique fields","text":"Use DISTINCT
to remove duplicate fields in the result set.
# Before using DISTINCT.\nnebula> MATCH (v:player{name:\"Tony Parker\"})--(v2:player) \\\n RETURN v2.player.name, v2.player.age;\n+---------------------+---------------+\n| v2.player.name | v2.player.age |\n+---------------------+---------------+\n| \"Manu Ginobili\" | 41 |\n| \"Boris Diaw\" | 36 |\n| \"Marco Belinelli\" | 32 |\n| \"Dejounte Murray\" | 29 |\n| \"Tim Duncan\" | 42 |\n| \"Tim Duncan\" | 42 |\n| \"LaMarcus Aldridge\" | 33 |\n| \"LaMarcus Aldridge\" | 33 |\n+---------------------+---------------+\n\n# After using DISTINCT.\nnebula> MATCH (v:player{name:\"Tony Parker\"})--(v2:player) \\\n RETURN DISTINCT v2.player.name, v2.player.age;\n+---------------------+---------------+\n| v2.player.name | v2.player.age |\n+---------------------+---------------+\n| \"Manu Ginobili\" | 41 |\n| \"Boris Diaw\" | 36 |\n| \"Marco Belinelli\" | 32 |\n| \"Dejounte Murray\" | 29 |\n| \"Tim Duncan\" | 42 |\n| \"LaMarcus Aldridge\" | 33 |\n+---------------------+---------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/sample/","title":"SAMPLE","text":"The SAMPLE
clause takes samples evenly in the result set and returns the specified amount of data.
SAMPLE
can be used in GO
statements only. The syntax is as follows:
<go_statement> SAMPLE <sample_list>;\n
sample_list
is a list. Elements in the list must be natural numbers, and the number of elements must be the same as the maximum number of STEPS
in the GO
statement. The following takes GO 1 TO 3 STEPS FROM \"A\" OVER * SAMPLE <sample_list>
as an example to introduce this usage of SAMPLE
in detail.
sample_list
must contain 3 natural numbers, such as GO 1 TO 3 STEPS FROM \"A\" OVER * SAMPLE [1,2,4]
. 1
in SAMPLE [1,2,4]
means that the system automatically selects 1 edge to continue traversal in the first step. 2
means to select 2 edges to continue traversal in the second step. 4
indicates that 4 edges are selected to continue traversal in the third step. If no edge matches in a certain step, or the number of matched edges is less than the specified number, the actual number of edges is returned. GO 1 TO 3 STEPS
means to return all the traversal results from the first to the third step. All the red edges and their source and destination vertices in the figure below are matched by this GO
statement, while the yellow edges represent the paths that are not selected during the traversal. If the statement is not GO 1 TO 3 STEPS
but GO 3 STEPS
, it will only match the red edges of the third step and the vertices at both ends. In the basketballplayer dataset, the example is as follows:
nebula> GO 3 STEPS FROM \"player100\" \\\n OVER * \\\n YIELD properties($$).name AS NAME, properties($$).age AS Age \\\n SAMPLE [1,2,3];\n+-----------------+----------+\n| NAME | Age |\n+-----------------+----------+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n+-----------------+----------+\n\nnebula> GO 1 TO 3 STEPS FROM \"player100\" \\\n OVER * \\\n YIELD properties($$).name AS NAME, properties($$).age AS Age \\\n SAMPLE [2,2,2];\n+-----------------+----------+\n| NAME | Age |\n+-----------------+----------+\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n| \"Tim Duncan\" | 42 |\n| \"Spurs\" | __NULL__ |\n| \"Manu Ginobili\" | 41 |\n| \"Spurs\" | __NULL__ |\n+-----------------+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/","title":"TTL","text":"TTL (Time To Live) is a mechanism in NebulaGraph that defines the lifespan of data. Once the data reaches its predefined lifespan, it is automatically deleted from the database. This feature is particularly suitable for data that only needs temporary storage, such as temporary sessions or cached data.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#opencypher_compatibility","title":"OpenCypher Compatibility","text":"This topic applies to native nGQL only.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#precautions","title":"Precautions","text":"TTL options and indexes have coexistence issues.
The native nGQL TTL feature has the following options.
Option Descriptionttl_col
Specifies an existing property to set a lifespan on. The data type of the property must be int
or timestamp
. ttl_duration
Specifies the lifespan in seconds, which is added to the value of the property specified by ttl_col. The value must be a non-negative int64 number. A property expires if the sum of its value and the
value is smaller than the current timestamp. If the ttl_duration
value is 0
, the property never expires.You can set ttl_use_ms
to true
in the configuration file nebula-storaged.conf
(default path: /usr/local/nightly/etc/
) to set the default unit to milliseconds. Warning
ttl_use_ms
to true
, make sure that no TTL has been set for any property, as shortening the expiration time may cause data to be erroneously deleted.ttl_use_ms
to true
, which sets the default TTL unit to milliseconds, the data type of the property specified by ttl_col
must be int
, and the property value needs to be manually converted to milliseconds. For example, when setting ttl_col
to a
, you need to convert the value of a
to milliseconds, such as when the value of a
is now()
, you need to set the value of a
to now() * 1000
.You must use the TTL options together to set a lifespan on a property.
Before using the TTL feature, you must first create a timestamp or integer property and specify it in the TTL options. NebulaGraph will not automatically create or manage this timestamp property for you.
When inserting the value of the timestamp or integer property, it is recommended to use the now()
function or the current timestamp to represent the present time.
If a tag or an edge type is already created, to set a timeout on a property bound to the tag or edge type, use ALTER
to update the tag or edge type.
# Create a tag.\nnebula> CREATE TAG IF NOT EXISTS t1 (a timestamp);\n\n# Use ALTER to update the tag and set the TTL options.\nnebula> ALTER TAG t1 TTL_COL = \"a\", TTL_DURATION = 5;\n\n# Insert a vertex with tag t1. The vertex expires 5 seconds after the insertion.\nnebula> INSERT VERTEX t1(a) VALUES \"101\":(now());\n
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#set_a_timeout_when_creating_a_tag_or_an_edge_type","title":"Set a timeout when creating a tag or an edge type","text":"Use TTL options in the CREATE
statement to set a timeout when creating a tag or an edge type. For more information, see CREATE TAG and CREATE EDGE.
# Create a tag and set the TTL options.\nnebula> CREATE TAG IF NOT EXISTS t2(a int, b int, c string) TTL_DURATION= 100, TTL_COL = \"a\";\n\n# Insert a vertex with tag t2. The timeout timestamp is 1648197238 (1648197138 + 100).\nnebula> INSERT VERTEX t2(a, b, c) VALUES \"102\":(1648197138, 30, \"Hello\");\n
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#data_expiration_and_deletion","title":"Data expiration and deletion","text":"Caution
NULL
, the property never expires. now()
is added to a tag or an edge type and the TTL options are set for the property, the history data related to the tag or the edge type will never expire because the value of that property for the history data is the current timestamp.Vertex property expiration has the following impact.
Since an edge can have only one edge type, once an edge property expires, the edge expires.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#data_deletion","title":"Data deletion","text":"The expired data are still stored on the disk, but queries will filter them out.
NebulaGraph automatically deletes the expired data and reclaims the disk space during the next compaction.
Note
If TTL is disabled, the corresponding data deleted after the last compaction can be queried again.
"},{"location":"3.ngql-guide/8.clauses-and-options/ttl-options/#remove_a_timeout","title":"Remove a timeout","text":"To disable TTL and remove the timeout on a property, you can use the following approaches.
nebula> ALTER TAG t1 DROP (a);\n
ttl_col
to an empty string.nebula> ALTER TAG t1 TTL_COL = \"\";\n
ttl_duration
to 0
. This operation keeps the TTL options and prevents the property from expiring and the property schema from being modified.nebula> ALTER TAG t1 TTL_DURATION = 0;\n
UNWIND
transform a list into a sequence of rows.
UNWIND
can be used as an individual statement or as a clause within a statement.
UNWIND <list> AS <alias> <RETURN clause>;\n
"},{"location":"3.ngql-guide/8.clauses-and-options/unwind/#examples","title":"Examples","text":"To transform a list.
nebula> UNWIND [1,2,3] AS n RETURN n;\n+---+\n| n |\n+---+\n| 1 |\n| 2 |\n| 3 |\n+---+\n
The UNWIND
clause in native nGQL statements.
Note
To use a UNWIND
clause in a native nGQL statement, use it after the |
operator and use the $-
prefix for variables. If you use a statement or clause after the UNWIND
clause, use the |
operator and use the $-
prefix for variables.
<statement> | UNWIND $-.<var> AS <alias> <|> <clause>;\n
The UNWIND
clause in openCypher statements.
<statement> UNWIND <list> AS <alias> <RETURN clause>\uff1b\n
To transform a list of duplicates into a unique set of rows using WITH DISTINCT
in a UNWIND
clause.
Note
WITH DISTINCT
is not available in native nGQL statements.
// Transform the list `[1,1,2,2,3,3]` into a unique set of rows, sort the rows, and then transform the rows into a list of unique values.\n\nnebula> WITH [1,1,2,2,3,3] AS n \\\n UNWIND n AS r \\\n WITH DISTINCT r AS r \\\n ORDER BY r \\\n RETURN collect(r);\n+------------+\n| collect(r) |\n+------------+\n| [1, 2, 3] |\n+------------+\n
To use an UNWIND
clause in a MATCH
statement.
// Get a list of the vertices in the matched path, transform the list into a unique set of rows, and then transform the rows into a list. \n\nnebula> MATCH p=(v:player{name:\"Tim Duncan\"})--(v2) \\\n WITH nodes(p) AS n \\\n UNWIND n AS r \\\n WITH DISTINCT r AS r \\\n RETURN collect(r);\n+----------------------------------------------------------------------------------------------------------------------+\n| collect(r) |\n+----------------------------------------------------------------------------------------------------------------------+\n| [(\"player100\" :player{age: 42, name: \"Tim Duncan\"}), (\"player101\" :player{age: 36, name: \"Tony Parker\"}), |\n|(\"team204\" :team{name: \"Spurs\"}), (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}), |\n|(\"player125\" :player{age: 41, name: \"Manu Ginobili\"}), (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}), |\n|(\"player144\" :player{age: 47, name: \"Shaquile O'Neal\"}), (\"player105\" :player{age: 31, name: \"Danny Green\"}), |\n|(\"player113\" :player{age: 29, name: \"Dejounte Murray\"}), (\"player107\" :player{age: 32, name: \"Aron Baynes\"}), |\n|(\"player109\" :player{age: 34, name: \"Tiago Splitter\"}), (\"player108\" :player{age: 36, name: \"Boris Diaw\"})] | \n+----------------------------------------------------------------------------------------------------------------------+\n
To use an UNWIND
clause in a GO
statement.
// Query the vertices in a list for the corresponding edges with a specified statement.\n\nnebula> YIELD ['player101', 'player100'] AS a | UNWIND $-.a AS b | GO FROM $-.b OVER follow YIELD edge AS e;\n+----------------------------------------------------+\n| e |\n+----------------------------------------------------+\n| [:follow \"player101\"->\"player100\" @0 {degree: 95}] |\n| [:follow \"player101\"->\"player102\" @0 {degree: 90}] |\n| [:follow \"player101\"->\"player125\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player101\" @0 {degree: 95}] |\n| [:follow \"player100\"->\"player125\" @0 {degree: 95}] |\n+----------------------------------------------------+\n
To use an UNWIND
clause in a LOOKUP
statement.
// Find all the properties of players whose age is greater than 46, get a list of unique properties, and then transform the list into rows. \n\nnebula> LOOKUP ON player \\\n WHERE player.age > 46 \\\n YIELD DISTINCT keys(vertex) as p | UNWIND $-.p as a | YIELD $-.a AS a;\n+--------+\n| a |\n+--------+\n| \"age\" |\n| \"name\" |\n+--------+\n
To use an UNWIND
clause in a FETCH
statement.
// Query player101 for all tags related to player101, get a list of the tags and then transform the list into rows.\n\nnebula> CREATE TAG hero(like string, height int);\n INSERT VERTEX hero(like, height) VALUES \"player101\":(\"deep\", 182);\n FETCH PROP ON * \"player101\" \\\n YIELD tags(vertex) as t | UNWIND $-.t as a | YIELD $-.a AS a;\n+----------+\n| a |\n+----------+\n| \"hero\" |\n| \"player\" |\n+----------+\n
To use an UNWIND
clause in a GET SUBGRAPH
statement.
// Get the subgraph including outgoing and incoming serve edges within 0~2 hops from/to player100, and transform the result into rows.\n\nnebula> GET SUBGRAPH 2 STEPS FROM \"player100\" BOTH serve \\\n YIELD edges as e | UNWIND $-.e as a | YIELD $-.a AS a;\n+----------------------------------------------+\n| a |\n+----------------------------------------------+\n| [:serve \"player100\"->\"team204\" @0 {}] |\n| [:serve \"player101\"->\"team204\" @0 {}] |\n| [:serve \"player102\"->\"team204\" @0 {}] |\n| [:serve \"player103\"->\"team204\" @0 {}] |\n| [:serve \"player105\"->\"team204\" @0 {}] |\n| [:serve \"player106\"->\"team204\" @0 {}] |\n| [:serve \"player107\"->\"team204\" @0 {}] |\n| [:serve \"player108\"->\"team204\" @0 {}] |\n| [:serve \"player109\"->\"team204\" @0 {}] |\n| [:serve \"player110\"->\"team204\" @0 {}] |\n| [:serve \"player111\"->\"team204\" @0 {}] |\n| [:serve \"player112\"->\"team204\" @0 {}] |\n| [:serve \"player113\"->\"team204\" @0 {}] |\n| [:serve \"player114\"->\"team204\" @0 {}] |\n| [:serve \"player125\"->\"team204\" @0 {}] |\n| [:serve \"player138\"->\"team204\" @0 {}] |\n| [:serve \"player104\"->\"team204\" @20132015 {}] |\n| [:serve \"player104\"->\"team204\" @20182019 {}] |\n+----------------------------------------------+\n
To use an UNWIND
clause in a FIND PATH
statement.
// Find all the vertices in the shortest path from player101 to team204 along the serve edge, and transform the result into rows. \n\nnebula> FIND SHORTEST PATH FROM \"player101\" TO \"team204\" OVER serve \\\n YIELD path as p | YIELD nodes($-.p) AS nodes | UNWIND $-.nodes AS a | YIELD $-.a AS a;\n+---------------+\n| a |\n+---------------+\n| (\"player101\") |\n| (\"team204\") |\n+---------------+\n
The WHERE
clause filters the output by conditions.
The WHERE
clause usually works in the following queries:
GO
and LOOKUP
.MATCH
and WITH
.Filtering on edge rank is a native nGQL feature. To retrieve the rank value in openCypher statements, use the rank() function, such as MATCH (:player)-[e:follow]->() RETURN rank(e);
.
Note
In the following examples, $$
and $^
are reference operators. For more information, see Operators.
Use the boolean operators NOT
, AND
, OR
, and XOR
to define conditions in WHERE
clauses. For the precedence of the operators, see Precedence.
nebula> MATCH (v:player) \\\n WHERE v.player.name == \"Tim Duncan\" \\\n XOR (v.player.age < 30 AND v.player.name == \"Yao Ming\") \\\n OR NOT (v.player.name == \"Yao Ming\" OR v.player.name == \"Tim Duncan\") \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Danny Green\" | 31 |\n| \"Tiago Splitter\" | 34 |\n| \"David West\" | 38 |\n...\n
nebula> GO FROM \"player100\" \\\n OVER follow \\\n WHERE properties(edge).degree > 90 \\\n OR properties($$).age != 33 \\\n AND properties($$).name != \"Tony Parker\" \\\n YIELD properties($$);\n+----------------------------------+\n| properties($$) |\n+----------------------------------+\n| {age: 41, name: \"Manu Ginobili\"} |\n+----------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_properties","title":"Filter on properties","text":"Use vertex or edge properties to define conditions in WHERE
clauses.
nebula> MATCH (v:player)-[e]->(v2) \\\n WHERE v2.player.age < 25 \\\n RETURN v2.player.name, v2.player.age;\n+----------------------+---------------+\n| v2.player.name | v2.player.age |\n+----------------------+---------------+\n| \"Ben Simmons\" | 22 |\n| \"Luka Doncic\" | 20 |\n| \"Kristaps Porzingis\" | 23 |\n+----------------------+---------------+\n
nebula> GO FROM \"player100\" OVER follow \\\n WHERE $^.player.age >= 42 \\\n YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
nebula> MATCH (v:player)-[e]->() \\\n WHERE e.start_year < 2000 \\\n RETURN DISTINCT v.player.name, v.player.age;\n+--------------------+--------------+\n| v.player.name | v.player.age |\n+--------------------+--------------+\n| \"Tony Parker\" | 36 |\n| \"Tim Duncan\" | 42 |\n| \"Grant Hill\" | 46 |\n...\n
nebula> GO FROM \"player100\" OVER follow \\\n WHERE follow.degree > 90 \\\n YIELD dst(edge);\n+-------------+\n| dst(EDGE) |\n+-------------+\n| \"player101\" |\n| \"player125\" |\n+-------------+\n
nebula> MATCH (v:player) \\\n WHERE v[toLower(\"AGE\")] < 21 \\\n RETURN v.player.name, v.player.age;\n+---------------+-------+\n| v.name | v.age |\n+---------------+-------+\n| \"Luka Doncic\" | 20 |\n+---------------+-------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_existing_properties","title":"Filter on existing properties","text":"nebula> MATCH (v:player) \\\n WHERE exists(v.player.age) \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Danny Green\" | 31 |\n| \"Tiago Splitter\" | 34 |\n| \"David West\" | 38 |\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_edge_rank","title":"Filter on edge rank","text":"In nGQL, if a group of edges has the same source vertex, destination vertex, and properties, the only thing that distinguishes them is the rank. Use rank conditions in WHERE
clauses to filter such edges.
# The following example creates test data.\nnebula> CREATE SPACE IF NOT EXISTS test (vid_type=FIXED_STRING(30));\nnebula> USE test;\nnebula> CREATE EDGE IF NOT EXISTS e1(p1 int);\nnebula> CREATE TAG IF NOT EXISTS person(p1 int);\nnebula> INSERT VERTEX person(p1) VALUES \"1\":(1);\nnebula> INSERT VERTEX person(p1) VALUES \"2\":(2);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@0:(10);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@1:(11);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@2:(12);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@3:(13);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@4:(14);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@5:(15);\nnebula> INSERT EDGE e1(p1) VALUES \"1\"->\"2\"@6:(16);\n\n# The following example use rank to filter edges and retrieves edges with a rank greater than 2.\nnebula> GO FROM \"1\" \\\n OVER e1 \\\n WHERE rank(edge) > 2 \\\n YIELD src(edge), dst(edge), rank(edge) AS Rank, properties(edge).p1 | \\\n ORDER BY $-.Rank DESC;\n+-----------+-----------+------+---------------------+\n| src(EDGE) | dst(EDGE) | Rank | properties(EDGE).p1 |\n+-----------+-----------+------+---------------------+\n| \"1\" | \"2\" | 6 | 16 |\n| \"1\" | \"2\" | 5 | 15 |\n| \"1\" | \"2\" | 4 | 14 |\n| \"1\" | \"2\" | 3 | 13 |\n+-----------+-----------+------+---------------------+\n\n# Filter edges by rank. Find follow edges with rank equal to 0.\nnebula> MATCH (v)-[e:follow]->() \\\n WHERE rank(e)==0 \\\n RETURN *;\n+------------------------------------------------------------+-----------------------------------------------------+\n| v | e |\n+------------------------------------------------------------+-----------------------------------------------------+\n| (\"player142\" :player{age: 29, name: \"Klay Thompson\"}) | [:follow \"player142\"->\"player117\" @0 {degree: 90}] |\n| (\"player139\" :player{age: 34, name: \"Marc Gasol\"}) | [:follow \"player139\"->\"player138\" @0 {degree: 99}] |\n| (\"player108\" :player{age: 36, name: \"Boris Diaw\"}) | [:follow \"player108\"->\"player100\" @0 {degree: 80}] |\n| (\"player108\" :player{age: 36, name: \"Boris Diaw\"}) | [:follow \"player108\"->\"player101\" @0 {degree: 80}] |\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_pattern","title":"Filter on pattern","text":"nebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(t) \\\n WHERE (v)-[e]->(t:team) \\\n RETURN (v)-->();\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| (v)-->() = (v)-->() |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| [<(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})>] |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n\nnebula> MATCH (v:player{name:\"Tim Duncan\"})-[e]->(t) \\\n WHERE NOT (v)-[e]->(t:team) \\\n RETURN (v)-->();\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| (v)-->() = (v)-->() |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| [<(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony Parker\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})>] |\n| [<(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:serve@0 {end_year: 2016, start_year: 1997}]->(\"team204\" :team{name: \"Spurs\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player101\" :player{age: 36, name: \"Tony 
Parker\"})>, <(\"player100\" :player{age: 42, name: \"Tim Duncan\"})-[:follow@0 {degree: 95}]->(\"player125\" :player{age: 41, name: \"Manu Ginobili\"})>] |\n+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_strings","title":"Filter on strings","text":"Use STARTS WITH
, ENDS WITH
, or CONTAINS
in WHERE
clauses to match a specific part of a string. String matching is case-sensitive.
STARTS WITH
","text":"STARTS WITH
will match the beginning of a string.
The following example uses STARTS WITH \"T\"
to retrieve the information of players whose name starts with T
.
nebula> MATCH (v:player) \\\n WHERE v.player.name STARTS WITH \"T\" \\\n RETURN v.player.name, v.player.age;\n+------------------+--------------+\n| v.player.name | v.player.age |\n+------------------+--------------+\n| \"Tony Parker\" | 36 |\n| \"Tiago Splitter\" | 34 |\n| \"Tim Duncan\" | 42 |\n| \"Tracy McGrady\" | 39 |\n+------------------+--------------+\n
If you use STARTS WITH \"t\"
in the preceding statement, an empty set is returned because no name in the dataset starts with the lowercase t
.
nebula> MATCH (v:player) \\\n WHERE v.player.name STARTS WITH \"t\" \\\n RETURN v.player.name, v.player.age;\n+---------------+--------------+\n| v.player.name | v.player.age |\n+---------------+--------------+\n+---------------+--------------+\nEmpty set (time spent 5080/6474 us)\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#ends_with","title":"ENDS WITH
","text":"ENDS WITH
will match the ending of a string.
The following example uses ENDS WITH \"r\"
to retrieve the information of players whose name ends with r
.
nebula> MATCH (v:player) \\\n WHERE v.player.name ENDS WITH \"r\" \\\n RETURN v.player.name, v.player.age;\n+------------------+--------------+\n| v.player.name | v.player.age |\n+------------------+--------------+\n| \"Tony Parker\" | 36 |\n| \"Tiago Splitter\" | 34 |\n| \"Vince Carter\" | 42 |\n+------------------+--------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#contains","title":"CONTAINS
","text":"CONTAINS
will match a certain part of a string.
The following example uses CONTAINS \"Pa\"
to match the information of players whose name contains Pa
.
nebula> MATCH (v:player) \\\n WHERE v.player.name CONTAINS \"Pa\" \\\n RETURN v.player.name, v.player.age;\n+---------------+--------------+\n| v.player.name | v.player.age |\n+---------------+--------------+\n| \"Paul George\" | 28 |\n| \"Tony Parker\" | 36 |\n| \"Paul Gasol\" | 38 |\n| \"Chris Paul\" | 33 |\n+---------------+--------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#negative_string_matching","title":"Negative string matching","text":"You can use the boolean operator NOT
to negate a string matching condition.
nebula> MATCH (v:player) \\\n WHERE NOT v.player.name ENDS WITH \"R\" \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Danny Green\" | 31 |\n| \"Tiago Splitter\" | 34 |\n| \"David West\" | 38 |\n| \"Russell Westbrook\" | 30 |\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#filter_on_lists","title":"Filter on lists","text":""},{"location":"3.ngql-guide/8.clauses-and-options/where/#match_values_in_a_list","title":"Match values in a list","text":"Use the IN
operator to check if a value is in a specific list.
nebula> MATCH (v:player) \\\n WHERE v.player.age IN range(20,25) \\\n RETURN v.player.name, v.player.age;\n+-------------------------+--------------+\n| v.player.name | v.player.age |\n+-------------------------+--------------+\n| \"Ben Simmons\" | 22 |\n| \"Giannis Antetokounmpo\" | 24 |\n| \"Kyle Anderson\" | 25 |\n| \"Joel Embiid\" | 25 |\n| \"Kristaps Porzingis\" | 23 |\n| \"Luka Doncic\" | 20 |\n+-------------------------+--------------+\n\nnebula> LOOKUP ON player \\\n WHERE player.age IN [25,28] \\\n YIELD properties(vertex).name, properties(vertex).age;\n+-------------------------+------------------------+\n| properties(VERTEX).name | properties(VERTEX).age |\n+-------------------------+------------------------+\n| \"Kyle Anderson\" | 25 |\n| \"Damian Lillard\" | 28 |\n| \"Joel Embiid\" | 25 |\n| \"Paul George\" | 28 |\n| \"Ricky Rubio\" | 28 |\n+-------------------------+------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/where/#match_values_not_in_a_list","title":"Match values not in a list","text":"Use NOT
before IN
to rule out the values in a list.
nebula> MATCH (v:player) \\\n WHERE v.player.age NOT IN range(20,25) \\\n RETURN v.player.name AS Name, v.player.age AS Age \\\n ORDER BY Age;\n+---------------------+-----+\n| Name | Age |\n+---------------------+-----+\n| \"Kyrie Irving\" | 26 |\n| \"Cory Joseph\" | 27 |\n| \"Damian Lillard\" | 28 |\n| \"Paul George\" | 28 |\n| \"Ricky Rubio\" | 28 |\n+---------------------+-----+\n...\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/","title":"WITH","text":"The WITH
clause can retrieve the output from a query part, process it, and pass it to the next query part as the input.
This topic applies to openCypher syntax only.
Note
WITH
has a similar function with the Pipe symbol in native nGQL, but they work in different ways. DO NOT use pipe symbols in the openCypher syntax or use WITH
in native nGQL statements.
Use a WITH
clause to combine statements and transfer the output of a statement as the input of another statement.
The following statement:
nodes()
function.nebula> MATCH p=(v:player{name:\"Tim Duncan\"})--() \\\n WITH nodes(p) AS n \\\n UNWIND n AS n1 \\\n RETURN DISTINCT n1;\n+-----------------------------------------------------------+\n| n1 |\n+-----------------------------------------------------------+\n| (\"player100\" :player{age: 42, name: \"Tim Duncan\"}) |\n| (\"player101\" :player{age: 36, name: \"Tony Parker\"}) |\n| (\"team204\" :team{name: \"Spurs\"}) |\n| (\"player102\" :player{age: 33, name: \"LaMarcus Aldridge\"}) |\n| (\"player125\" :player{age: 41, name: \"Manu Ginobili\"}) |\n| (\"player104\" :player{age: 32, name: \"Marco Belinelli\"}) |\n| (\"player144\" :player{age: 47, name: \"Shaquille O'Neal\"}) |\n| (\"player105\" :player{age: 31, name: \"Danny Green\"}) |\n| (\"player113\" :player{age: 29, name: \"Dejounte Murray\"}) |\n| (\"player107\" :player{age: 32, name: \"Aron Baynes\"}) |\n| (\"player109\" :player{age: 34, name: \"Tiago Splitter\"}) |\n| (\"player108\" :player{age: 36, name: \"Boris Diaw\"}) |\n+-----------------------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#example_2","title":"Example 2","text":"The following statement:
player100
.labels()
function.nebula> MATCH (v) \\\n WHERE id(v)==\"player100\" \\\n WITH labels(v) AS tags_unf \\\n UNWIND tags_unf AS tags_f \\\n RETURN tags_f;\n+----------+\n| tags_f |\n+----------+\n| \"player\" |\n+----------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#filter_composite_queries","title":"Filter composite queries","text":"WITH
can work as a filter in the middle of a composite query.
nebula> MATCH (v:player)-->(v2:player) \\\n WITH DISTINCT v2 AS v2, v2.player.age AS Age \\\n ORDER BY Age \\\n WHERE Age<25 \\\n RETURN v2.player.name AS Name, Age;\n+----------------------+-----+\n| Name | Age |\n+----------------------+-----+\n| \"Luka Doncic\" | 20 |\n| \"Ben Simmons\" | 22 |\n| \"Kristaps Porzingis\" | 23 |\n+----------------------+-----+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#process_the_output_before_using_collect","title":"Process the output before using collect()","text":"Use a WITH
clause to sort and limit the output before using collect()
to transform the output into a list.
nebula> MATCH (v:player) \\\n WITH v.player.name AS Name \\\n ORDER BY Name DESC \\\n LIMIT 3 \\\n RETURN collect(Name);\n+-----------------------------------------------+\n| collect(Name) |\n+-----------------------------------------------+\n| [\"Yao Ming\", \"Vince Carter\", \"Tracy McGrady\"] |\n+-----------------------------------------------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/with/#use_with_return","title":"Use with RETURN","text":"Set an alias using a WITH
clause, and then output the result through a RETURN
clause.
nebula> WITH [1, 2, 3] AS `list` RETURN 3 IN `list` AS r;\n+------+\n| r |\n+------+\n| true |\n+------+\n\nnebula> WITH 4 AS one, 3 AS two RETURN one > two AS result;\n+--------+\n| result |\n+--------+\n| true |\n+--------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/yield/","title":"YIELD","text":"YIELD
defines the output of an nGQL query.
YIELD
can lead a clause or a statement:
YIELD
clause works in nGQL statements such as GO
, FETCH
, or LOOKUP
and must be defined to return the result.YIELD
statement works in a composite query or independently.This topic applies to native nGQL only. For the openCypher syntax, use RETURN
.
YIELD
has different functions in openCypher and nGQL.
In openCypher, YIELD
is used in the CALL[\u2026YIELD]
clause to specify the output of the procedure call.
Note
NGQL does not support CALL[\u2026YIELD]
yet.
YIELD
works like RETURN
in openCypher.Note
In the following examples, $$
and $-
are property references. For more information, see Reference to properties.
YIELD [DISTINCT] <col> [AS <alias>] [, <col> [AS <alias>] ...];\n
Parameter Description DISTINCT
Aggregates the output and makes the statement return a distinct result set. col
A field to be returned. If no alias is set, col
will be a column name in the output. alias
An alias for col
. It is set after the keyword AS
and will be a column name in the output."},{"location":"3.ngql-guide/8.clauses-and-options/yield/#use_a_yield_clause_in_a_statement","title":"Use a YIELD clause in a statement","text":"YIELD
with GO
:nebula> GO FROM \"player100\" OVER follow \\\n YIELD properties($$).name AS Friend, properties($$).age AS Age;\n+-----------------+-----+\n| Friend | Age |\n+-----------------+-----+\n| \"Tony Parker\" | 36 |\n| \"Manu Ginobili\" | 41 |\n+-----------------+-----+\n
YIELD
with FETCH
:nebula> FETCH PROP ON player \"player100\" \\\n YIELD properties(vertex).name;\n+-------------------------+\n| properties(VERTEX).name |\n+-------------------------+\n| \"Tim Duncan\" |\n+-------------------------+\n
YIELD
with LOOKUP
:nebula> LOOKUP ON player WHERE player.name == \"Tony Parker\" \\\n YIELD properties(vertex).name, properties(vertex).age;\n+-------------------------+------------------------+\n| properties(VERTEX).name | properties(VERTEX).age |\n+-------------------------+------------------------+\n| \"Tony Parker\" | 36 |\n+-------------------------+------------------------+\n
YIELD [DISTINCT] <col> [AS <alias>] [, <col> [AS <alias>] ...]\n[WHERE <conditions>];\n
Parameter Description DISTINCT
Aggregates the output and makes the statement return a distinct result set. col
A field to be returned. If no alias is set, col
will be a column name in the output. alias
An alias for col
. It is set after the keyword AS
and will be a column name in the output. conditions
Conditions set in a WHERE
clause to filter the output. For more information, see WHERE
."},{"location":"3.ngql-guide/8.clauses-and-options/yield/#use_a_yield_statement_in_a_composite_query","title":"Use a YIELD statement in a composite query","text":"In a composite query, a YIELD
statement accepts, filters, and modifies the result set of the preceding statement, and then outputs it.
The following query finds the players that \"player100\" follows and calculates their average age.
nebula> GO FROM \"player100\" OVER follow \\\n YIELD dst(edge) AS ID \\\n | FETCH PROP ON player $-.ID \\\n YIELD properties(vertex).age AS Age \\\n | YIELD AVG($-.Age) as Avg_age, count(*)as Num_friends;\n+---------+-------------+\n| Avg_age | Num_friends |\n+---------+-------------+\n| 38.5 | 2 |\n+---------+-------------+\n
The following query finds the players that \"player101\" follows with the follow degrees greater than 90.
nebula> $var1 = GO FROM \"player101\" OVER follow \\\n YIELD properties(edge).degree AS Degree, dst(edge) as ID; \\\n YIELD $var1.ID AS ID WHERE $var1.Degree > 90;\n+-------------+\n| ID |\n+-------------+\n| \"player100\" |\n| \"player125\" |\n+-------------+\n
The following query finds the vertices with the tag player that are older than 30 and younger than 32, and returns the deduplicated results.
nebula> LOOKUP ON player \\\n WHERE player.age < 32 and player.age >30 \\\n YIELD DISTINCT properties(vertex).age as v;\n+--------+\n| v |\n+--------+\n| 31 |\n+--------+\n
"},{"location":"3.ngql-guide/8.clauses-and-options/yield/#use_a_standalone_yield_statement","title":"Use a standalone YIELD statement","text":"A YIELD
statement can calculate a valid expression and output the result.
nebula> YIELD rand32(1, 6);\n+-------------+\n| rand32(1,6) |\n+-------------+\n| 3 |\n+-------------+\n\nnebula> YIELD \"Hel\" + \"\\tlo\" AS string1, \", World!\" AS string2;\n+-------------+------------+\n| string1 | string2 |\n+-------------+------------+\n| \"Hel lo\" | \", World!\" |\n+-------------+------------+\n\nnebula> YIELD hash(\"Tim\") % 100;\n+-----------------+\n| (hash(Tim)%100) |\n+-----------------+\n| 42 |\n+-----------------+\n\nnebula> YIELD \\\n CASE 2+3 \\\n WHEN 4 THEN 0 \\\n WHEN 5 THEN 1 \\\n ELSE -1 \\\n END \\\n AS result;\n+--------+\n| result |\n+--------+\n| 1 |\n+--------+\n\nnebula> YIELD 1- -1;\n+----------+\n| (1--(1)) |\n+----------+\n| 2 |\n+----------+\n
"},{"location":"3.ngql-guide/9.space-statements/1.create-space/","title":"CREATE SPACE","text":"Graph spaces are used to store data in a physically isolated way in NebulaGraph, which is similar to the database concept in MySQL. The CREATE SPACE
statement can create a new graph space or clone the schema of an existing graph space.
Only the God role can use the CREATE SPACE
statement. For more information, see AUTHENTICATION.
CREATE SPACE [IF NOT EXISTS] <graph_space_name> (\n [partition_num = <partition_number>,]\n [replica_factor = <replica_number>,]\n vid_type = {FIXED_STRING(<N>) | INT[64]}\n )\n [COMMENT = '<comment>']\n
Parameter Description IF NOT EXISTS
Detects if the related graph space exists. If it does not exist, a new one will be created. The graph space existence detection here only compares the graph space name (excluding properties). <graph_space_name>
1. Uniquely identifies a graph space in a NebulaGraph instance. 2. Space names cannot be modified after they are set. 3. By default, the name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. However, it cannot include special characters other than the underscore (_), and cannot start with a number. 4. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and do not include periods (.
) within the pair of backticks (`). For more information, see Keywords and reserved words. Note:1. If you name a space in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in a space name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . partition_num
Specifies the number of partitions in each replica. The suggested value is 20 times (2 times for HDD) the number of hard disks in the cluster. For example, if you have three hard disks in the cluster, we recommend that you set the value to 60. The default value is 100. replica_factor
Specifies the number of replicas in the cluster. The suggested number is 3 in a production environment and 1 in a test environment. The replica number must be an odd number for the need of quorum-based voting. The default value is 1. vid_type
A required parameter. Specifies the VID type in a graph space. Available values are FIXED_STRING(N)
and INT64
. INT
equals to INT64
. FIXED_STRING(<N>)
specifies the VID as a string, while INT64
specifies it as an integer. N
represents the maximum length of the VIDs. If you set a VID that is longer than N
bytes, NebulaGraph throws an error. Note that the byte length of UTF-8 characters varies: a UTF-8 Chinese character takes 3 bytes, so 11 Chinese characters (33 bytes) exceed a FIXED_STRING(32) VID definition. COMMENT
The remarks of the graph space. The maximum length is 256 bytes. By default, there is no comment on a space. Caution
Restrictions on VID type change and VID length:
For NebulaGraph v1.x, the VID type can only be INT64
, and the String type is not allowed. For NebulaGraph v2.x, both INT64
and FIXED_STRING(<N>)
VID types are allowed. You must specify the VID type when creating a graph space, and use the same VID type in INSERT
statements, otherwise, an error message Wrong vertex id type: 1001
occurs. The length of a String-type VID cannot exceed N
characters. If it exceeds N
, NebulaGraph throws The VID must be a 64-bit integer or a string fitting space vertex id length limit.
.If the Host not enough!
error appears, the immediate cause is that the number of online storage hosts is less than the value of replica_factor
specified when creating a graph space. In this case, you can use the SHOW HOSTS
command to see if the following situations occur:
The cluster has only one storage host. In this case, replica_factor
can only be set to 1
. Or create a graph space after storage hosts are scaled out. A new storage host has been added, but ADD HOSTS
is not executed to activate it. In this case, run SHOW HOSTS
to locate the new storage host information and then run ADD HOSTS
to activate it. A graph space can be created after there are enough storage hosts.SHOW HOSTS
returns enough online storage hosts but the error persists; in this case, further troubleshooting is needed.
For NebulaGraph v2.x before v2.5.0, vid_type
is optional and defaults to FIXED_STRING(8)
.
Note
graph_space_name
, partition_num
, replica_factor
, vid_type
, and comment
cannot be modified once set. To modify them, drop the current working graph space with DROP SPACE
and create a new one with CREATE SPACE
.
CREATE SPACE [IF NOT EXISTS] <new_graph_space_name> AS <old_graph_space_name>;\n
Parameter Description IF NOT EXISTS
Detects if the new graph space exists. If it does not exist, the new one will be created. The graph space existence detection here only compares the graph space name (excluding properties). <new_graph_space_name>
The name of the graph space that is newly created. By default, the space name only supports 1-4 byte UTF-8 encoded characters, including English letters (case sensitive), numbers, Chinese characters, etc. But special characters can only use underscore, and cannot start with a number. To use special characters, reserved keywords, or start with a number, quote the entire name with backticks (`) and cannot use periods (.
). For more information, see Keywords and reserved words. When a new graph space is created, the schema of the old graph space <old_graph_space_name>
will be cloned, including its parameters (the number of partitions and replicas, etc.), Tag, Edge type and native indexes. Note:1. If you name a space in Chinese and encounter a SyntaxError
, you need to quote the Chinese characters with backticks (`). 2. To include a backtick (`) in a space name, use a backslash to escape the backtick, such as \\`; to include a backslash, the backslash itself also needs to be escaped, such as \\ . <old_graph_space_name>
The name of the graph space that already exists."},{"location":"3.ngql-guide/9.space-statements/1.create-space/#examples","title":"Examples","text":"# The following example creates a graph space with a specified VID type and the maximum length. Other fields still use the default values.\nnebula> CREATE SPACE IF NOT EXISTS my_space_1 (vid_type=FIXED_STRING(30));\n\n# The following example creates a graph space with a specified partition number, replica number, and VID type.\nnebula> CREATE SPACE IF NOT EXISTS my_space_2 (partition_num=15, replica_factor=1, vid_type=FIXED_STRING(30));\n\n# The following example creates a graph space with a specified partition number, replica number, and VID type, and adds a comment on it.\nnebula> CREATE SPACE IF NOT EXISTS my_space_3 (partition_num=15, replica_factor=1, vid_type=FIXED_STRING(30)) comment=\"Test the graph space\";\n\n# Clone a graph space.\nnebula> CREATE SPACE IF NOT EXISTS my_space_4 as my_space_3;\nnebula> SHOW CREATE SPACE my_space_4;\n+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| Space | Create Space |\n+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| \"my_space_4\" | \"CREATE SPACE `my_space_4` (partition_num = 15, replica_factor = 1, charset = utf8, collate = utf8_bin, vid_type = FIXED_STRING(30)) comment = '\u6d4b\u8bd5\u56fe\u7a7a\u95f4'\" |\n+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+\n
"},{"location":"3.ngql-guide/9.space-statements/1.create-space/#implementation_of_the_operation","title":"Implementation of the operation","text":"Caution
Trying to use a newly created graph space may fail because the creation is implemented asynchronously. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs
parameter in the configuration files for all services. If the heartbeat interval is too short (i.e., less than 5 seconds), disconnection between peers may happen because of the misjudgment of machines in the distributed system.
On some large clusters, the partition distribution is possibly unbalanced because of the different startup times. You can run the following command to do a check of the machine distribution.
nebula> SHOW HOSTS;\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 8 | \"basketballplayer:3, test:5\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 9 | \"basketballplayer:4, test:5\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 3 | \"basketballplayer:3\" | \"basketballplayer:10, test:10\" | \"master\" |\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n
To balance the request loads, use the following command.
nebula> BALANCE LEADER;\nnebula> SHOW HOSTS;\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n| Host | Port | HTTP port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+-----------+----------+--------------+--------------------------------+--------------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 7 | \"basketballplayer:3, test:4\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 7 | \"basketballplayer:4, test:3\" | \"basketballplayer:10, test:10\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 6 | \"basketballplayer:3, test:3\" | \"basketballplayer:10, test:10\" | \"master\" |\n+-------------+------+----------+--------------+--------------------------------+--------------------------------+---------+\n
"},{"location":"3.ngql-guide/9.space-statements/2.use-space/","title":"USE","text":"USE
specifies a graph space as the current working graph space for subsequent queries.
Running the USE
statement requires some privileges for the graph space. Otherwise, NebulaGraph throws an error.
USE <graph_space_name>;\n
"},{"location":"3.ngql-guide/9.space-statements/2.use-space/#examples","title":"Examples","text":"# The following example creates two sample spaces.\nnebula> CREATE SPACE IF NOT EXISTS space1 (vid_type=FIXED_STRING(30));\nnebula> CREATE SPACE IF NOT EXISTS space2 (vid_type=FIXED_STRING(30));\n\n# The following example specifies space1 as the current working graph space.\nnebula> USE space1;\n\n# The following example specifies space2 as the current working graph space. Hereafter, you cannot read any data from space1, because these vertices and edges being traversed have no relevance with space1.\nnebula> USE space2;\n
Caution
You cannot use two graph spaces in one statement.
Different from Fabric Cypher, graph spaces in NebulaGraph are fully isolated from each other. Making a graph space as the working graph space prevents you from accessing other spaces. The only way to traverse in a new graph space is to switch by the USE
statement. In Fabric Cypher, you can use two graph spaces in one statement (using the USE + CALL
syntax). But in NebulaGraph, you can only use one graph space in one statement.
SHOW SPACES
lists all the graph spaces in the NebulaGraph examples.
SHOW SPACES;\n
"},{"location":"3.ngql-guide/9.space-statements/3.show-spaces/#example","title":"Example","text":"nebula> SHOW SPACES;\n+--------------------+\n| Name |\n+--------------------+\n| \"cba\" |\n| \"basketballplayer\" |\n+--------------------+\n
To create graph spaces, see CREATE SPACE.
"},{"location":"3.ngql-guide/9.space-statements/4.describe-space/","title":"DESCRIBE SPACE","text":"DESCRIBE SPACE
returns the information about the specified graph space.
You can use DESC
instead of DESCRIBE
for short.
DESC[RIBE] SPACE <graph_space_name>;\n
The DESCRIBE SPACE
statement is different from the SHOW SPACES
statement. For details about SHOW SPACES
, see SHOW SPACES.
nebula> DESCRIBE SPACE basketballplayer;\n+----+--------------------+------------------+----------------+---------+------------+--------------------+---------+\n| ID | Name | Partition Number | Replica Factor | Charset | Collate | Vid Type | Comment |\n+----+--------------------+------------------+----------------+---------+------------+--------------------+---------+\n| 1 | \"basketballplayer\" | 10 | 1 | \"utf8\" | \"utf8_bin\" | \"FIXED_STRING(32)\" | |\n+----+--------------------+------------------+----------------+---------+------------+--------------------+---------+\n
"},{"location":"3.ngql-guide/9.space-statements/5.drop-space/","title":"DROP SPACE","text":"DROP SPACE
deletes the specified graph space and everything in it.
Note
DROP SPACE
can only delete the specified logic graph space while retain all the data on the hard disk by modifying the value of auto_remove_invalid_space
to false
in the Storage service configuration file. For more information, see Storage configuration.
Warning
After you execute DROP SPACE
, even if the snapshot contains data of the graph space, the data of the graph space cannot be recovered.
Only the God role can use the DROP SPACE
statement. For more information, see AUTHENTICATION.
DROP SPACE [IF EXISTS] <graph_space_name>;\n
You can use the IF EXISTS
keywords when dropping spaces. These keywords automatically detect if the related graph space exists. If it exists, it will be deleted. Otherwise, no graph space will be deleted.
Legacy version compatibility
In NebulaGraph versions earlier than 3.1.0, the DROP SPACE
statement does not remove all the files and directories from the disk by default.
Danger
BE CAUTIOUS about running the DROP SPACE
statement.
Q: Why is my disk space not freed after executing the 'DROP SPACE' statement and deleting a graph space?
A: For NebulaGraph versions earlier than 3.1.0, DROP SPACE
can only delete the specified logic graph space and does not delete the files and directories on the disk. To delete the files and directories on the disk, manually delete the corresponding file path. The file path is located in <nebula_graph_install_path>/data/storage/nebula/<space_id>
. The <space_id>
can be viewed via DESCRIBE SPACE {space_name}
.
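As a hedged walkthrough assuming the default installation path: if DESCRIBE SPACE reported ID 1 for the dropped space, the leftover files would be removed as follows (the ID is hypothetical; verify it before deleting anything):
# Hypothetical: the space ID was 1 per DESCRIBE SPACE before the drop.
rm -rf /usr/local/nebula/data/storage/nebula/1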
CLEAR SPACE
deletes the vertices and edges in a graph space, but does not delete the graph space itself and the schema information.
Note
It is recommended to execute SUBMIT JOB COMPACT immediately after executing the CLEAR SPACE
operation improve the query performance. Note that the COMPACT operation may affect query performance, and it is recommended to perform this operation during low business hours (e.g., early morning).
Only the God role has the permission to run CLEAR SPACE
.
CLEAR SPACE
with caution.CLEAR SPACE
is not an atomic operation. If an error occurs, re-run CLEAR SPACE
to avoid data remaining.storage_client_timeout_ms
parameter in the Graph Service configuration.CLEAR SPACE
, writing data into the graph space is not automatically prohibited. Such write operations can result in incomplete data clearing, and the residual data can be damaged.Note
The NebulaGraph Community Edition does not support blocking data writing while allowing CLEAR SPACE
.
CLEAR SPACE [IF EXISTS] <space_name>;\n
Parameter/Option Description IF EXISTS
Check whether the graph space to be cleared exists. If it exists, continue to clear it. If it does not exist, the execution finishes, and a message indicating that the execution succeeded is displayed. If IF EXISTS
is not set and the graph space does not exist, the CLEAR SPACE
statement fails to execute, and an error occurs. space_name
The name of the space to be cleared. Example:
CLEAR SPACE basketballplayer;\n
"},{"location":"3.ngql-guide/9.space-statements/6.clear-space/#data_reserved","title":"Data reserved","text":"CLEAR SPACE
does not delete the following data in a graph space:
The following example shows what CLEAR SPACE
deletes and reserves.
# Enter the graph space basketballplayer.\nnebula [(none)]> use basketballplayer;\nExecution succeeded\n\n# List tags and Edge types.\nnebula[basketballplayer]> SHOW TAGS;\n+----------+\n| Name |\n+----------+\n| \"player\" |\n| \"team\" |\n+----------+\nGot 2 rows\n\nnebula[basketballplayer]> SHOW EDGES;\n+----------+\n| Name |\n+----------+\n| \"follow\" |\n| \"serve\" |\n+----------+\nGot 2 rows\n\n# Submit a job to make statistics of the graph space.\nnebula[basketballplayer]> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 4 |\n+------------+\nGot 1 rows\n\n# Check the statistics.\nnebula[basketballplayer]> SHOW STATS;\n+---------+------------+-------+\n| Type | Name | Count |\n+---------+------------+-------+\n| \"Tag\" | \"player\" | 51 |\n| \"Tag\" | \"team\" | 30 |\n| \"Edge\" | \"follow\" | 81 |\n| \"Edge\" | \"serve\" | 152 |\n| \"Space\" | \"vertices\" | 81 |\n| \"Space\" | \"edges\" | 233 |\n+---------+------------+-------+\nGot 6 rows\n\n# List tag indexes.\nnebula[basketballplayer]> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\nGot 2 rows\n\n# ----------------------- Dividing line for CLEAR SPACE -----------------------\n# Run CLEAR SPACE to clear the graph space basketballplayer.\nnebula[basketballplayer]> CLEAR SPACE basketballplayer;\nExecution succeeded\n\n# Update the statistics.\nnebula[basketballplayer]> SUBMIT JOB STATS;\n+------------+\n| New Job Id |\n+------------+\n| 5 |\n+------------+\nGot 1 rows\n\n# Check the statistics. The tags and edge types still exist, but all the vertices and edges are gone.\nnebula[basketballplayer]> SHOW STATS;\n+---------+------------+-------+\n| Type | Name | Count |\n+---------+------------+-------+\n| \"Tag\" | \"player\" | 0 |\n| \"Tag\" | \"team\" | 0 |\n| \"Edge\" | \"follow\" | 0 |\n| \"Edge\" | \"serve\" | 0 |\n| \"Space\" | \"vertices\" | 0 |\n| \"Space\" | \"edges\" | 0 |\n+---------+------------+-------+\nGot 6 rows\n\n# Try to list the tag indexes. They still exist.\nnebula[basketballplayer]> SHOW TAG INDEXES;\n+------------------+----------+----------+\n| Index Name | By Tag | Columns |\n+------------------+----------+----------+\n| \"player_index_0\" | \"player\" | [] |\n| \"player_index_1\" | \"player\" | [\"name\"] |\n+------------------+----------+----------+\nGot 2 rows (time spent 523/978 us)\n
"},{"location":"4.deployment-and-installation/1.resource-preparations/","title":"Prepare resources for compiling, installing, and running NebulaGraph","text":"This topic describes the requirements and suggestions for compiling and installing NebulaGraph, as well as how to estimate the resource you need to reserve for running a NebulaGraph cluster.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#about_storage_devices","title":"About storage devices","text":"NebulaGraph is designed and implemented for NVMe SSD. All default parameters are optimized for the SSD devices and require extremely high IOPS and low latency.
Starting with 3.0.2, you can run containerized NebulaGraph databases on Docker Desktop for ARM macOS or on ARM Linux servers.
Caution
We do not recommend you deploy NebulaGraph on Docker Desktop for Windows due to its subpar performance. For details, see #12401.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#requirements_for_compiling_the_source_code","title":"Requirements for compiling the source code","text":""},{"location":"4.deployment-and-installation/1.resource-preparations/#hardware_requirements_for_compiling_nebulagraph","title":"Hardware requirements for compiling NebulaGraph","text":"Item Requirement CPU architecture x86_64 Memory 4 GB Disk 10 GB, SSD"},{"location":"4.deployment-and-installation/1.resource-preparations/#supported_operating_systems_for_compiling_nebulagraph","title":"Supported operating systems for compiling NebulaGraph","text":"For now, we can only compile NebulaGraph in the Linux system. We recommend that you use any Linux system with kernel version 4.15
or above.
Note
To install NebulaGraph on Linux systems with kernel version lower than required, use RPM/DEB packages or TAR files.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#software_requirements_for_compiling_nebulagraph","title":"Software requirements for compiling NebulaGraph","text":"You must have the correct version of the software listed below to compile NebulaGraph. If they are not as required or you are not sure, follow the steps in Prepare software for compiling NebulaGraph to get them ready.
Software Version Note glibc 2.17 or above You can runldd --version
to check the glibc version. make Any stable version - m4 Any stable version - git Any stable version - wget Any stable version - unzip Any stable version - xz Any stable version - readline-devel Any stable version - ncurses-devel Any stable version - zlib-devel Any stable version - g++ 8.5.0 or above You can run gcc -v
to check the gcc version. cmake 3.14.0 or above You can run cmake --version
to check the cmake version. curl Any stable version - redhat-lsb-core Any stable version - libstdc++-static Any stable version Only needed in CentOS 8+, RedHat 8+, and Fedora systems. libasan Any stable version Only needed in CentOS 8+, RedHat 8+, and Fedora systems. bzip2 Any stable version - Other third-party software will be automatically downloaded and installed to the build
directory at the configure (cmake) stage.
If part of the dependencies are missing or the versions does not meet the requirements, manually install them with the following steps. You can skip unnecessary dependencies or steps according to your needs.
Install dependencies.
$ yum update\n$ yum install -y make \\\n m4 \\\n git \\\n wget \\\n unzip \\\n xz \\\n readline-devel \\\n ncurses-devel \\\n zlib-devel \\\n gcc \\\n gcc-c++ \\\n cmake \\\n curl \\\n redhat-lsb-core \\\n bzip2\n // For CentOS 8+, RedHat 8+, and Fedora, install libstdc++-static and libasan as well\n$ yum install -y libstdc++-static libasan\n
$ apt-get update\n$ apt-get install -y make \\\n m4 \\\n git \\\n wget \\\n unzip \\\n xz-utils \\\n curl \\\n lsb-core \\\n build-essential \\\n libreadline-dev \\\n ncurses-dev \\\n cmake \\\n bzip2\n
Check if the GCC and cmake on your host are in the right version. See Software requirements for compiling NebulaGraph for the required versions.
$ g++ --version\n$ cmake --version\n
If your GCC and CMake are in the right versions, then you are all set and you can ignore the subsequent steps. If they are not, select and perform the needed steps as follows.
If the CMake version is incorrect, visit the CMake official website to install the required version.
If the G++ version is incorrect, visit the G++ official website or follow the instructions below to to install the required version.
For CentOS users, run:
yum install centos-release-scl\nyum install devtoolset-11\nscl enable devtoolset-11 'bash'\n
For Ubuntu users, run:
add-apt-repository ppa:ubuntu-toolchain-r/test\napt install gcc-11 g++-11\n
For now, we can only install NebulaGraph in the Linux system. To install NebulaGraph in a test environment, we recommend that you use any Linux system with kernel version 3.9
or above.
For example, for a single-machine test environment, you can deploy 1 metad, 1 storaged, and 1 graphd processes in the machine.
For a more common test environment, such as a cluster of 3 machines (named as A, B, and C), you can deploy NebulaGraph as follows:
Machine name Number of metad Number of storaged Number of graphd A 1 1 1 B None 1 1 C None 1 1"},{"location":"4.deployment-and-installation/1.resource-preparations/#requirements_and_suggestions_for_installing_nebulagraph_in_production_environments","title":"Requirements and suggestions for installing NebulaGraph in production environments","text":""},{"location":"4.deployment-and-installation/1.resource-preparations/#hardware_requirements_for_production_environments","title":"Hardware requirements for production environments","text":"Item Requirement CPU architecture x86_64 Number of CPU core 48 Memory 256 GB Disk 2 * 1.6 TB, NVMe SSD"},{"location":"4.deployment-and-installation/1.resource-preparations/#supported_operating_systems_for_production_environments","title":"Supported operating systems for production environments","text":"For now, we can only install NebulaGraph in the Linux system. To install NebulaGraph in a production environment, we recommend that you use any Linux system with kernel version 3.9 or above.
Users can adjust some of the kernel parameters to better accommodate the need for running NebulaGraph. For more information, see kernel configuration.
"},{"location":"4.deployment-and-installation/1.resource-preparations/#suggested_service_architecture_for_production_environments","title":"Suggested service architecture for production environments","text":"Danger
DO NOT deploy a single cluster across IDCs (The Enterprise Edtion supports data synchronization between clusters across IDCs).
Process Suggested number metad (the metadata service process) 3 storaged (the storage service process) 3 or more graphd (the query engine service process) 3 or moreEach metad process automatically creates and maintains a replica of the metadata. Usually, you need to deploy three metad processes and only three.
The number of storaged processes does not affect the number of graph space replicas.
Users can deploy multiple processes on a single machine. For example, on a cluster of 5 machines (named as A, B, C, D, and E), you can deploy NebulaGraph as follows:
Machine name Number of metad Number of storaged Number of graphd A 1 1 1 B 1 1 1 C 1 1 1 D None 1 1 E None 1 1"},{"location":"4.deployment-and-installation/1.resource-preparations/#capacity_requirements_for_running_a_nebulagraph_cluster","title":"Capacity requirements for running a NebulaGraph cluster","text":"Users can estimate the memory, disk space, and partition number needed for a NebulaGraph cluster of 3 replicas as follows.
Resource Unit How to estimate Description Disk space for a cluster Bytesthe_sum_of_edge_number_and_vertex_number
* average_bytes_of_properties
* 7.5 * 120% For more information, see Edge partitioning and storage amplification. Memory for a cluster Bytes [the_sum_of_edge_number_and_vertex_number
* 16 + the_number_of_RocksDB_instances
* (write_buffer_size
* max_write_buffer_number
) + rocksdb_block_cache
] * 120% write_buffer_size
and max_write_buffer_number
are RocksDB parameters. For more information, see MemTable. For details about rocksdb_block_cache
, see Memory usage in RocksDB. Number of partitions for a graph space - the_number_of_disks_in_the_cluster
* disk_partition_num_multiplier
disk_partition_num_multiplier
is an integer between 2 and 20 (both including). Its value depends on the disk performance. Use 20 for SSD and 2 for HDD. Answer: On one hand, the data in one single replica takes up about 2.5 times more space than that of the original data file (csv) according to test values. On the other hand, indexes take up additional space. Each indexed vertex or edge takes up 16 bytes of memory. The hard disk space occupied by the index can be empirically estimated as the total number of indexed vertices or edges * 50 bytes.
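As a worked sketch of the disk formula above, assume a hypothetical graph with 100 million vertices and edges in total and an average of 100 bytes of properties per element:
\(10^{8} \times 100 \times 7.5 \times 120\% = 9 \times 10^{10}\ \text{Bytes} \approx 90\ \text{GB}\)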
Answer: The extra 20% is for buffer.
Question 3: How to get the number of RocksDB instances?
Answer: Each graph space corresponds to one RocksDB instance and each directory in the --data_path
item in the etc/nebula-storaged.conf
file corresponds to one RocksDB instance. That is, the number of RocksDB instances = the number of directories * the number of graph spaces.
Note
Users can decrease the memory size occupied by the bloom filter by adding --enable_partitioned_index_filter=true
in etc/nebula-storaged.conf
. But it may decrease the read performance in some random-seek cases.
Caution
Each RocksDB instance takes up about 70M of disk space even when no data has been written yet. One partition corresponds to one RocksDB instance, and when the partition setting is very large, for example, 100, the graph space takes up a lot of disk space after it is created.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/","title":"Uninstall NebulaGraph","text":"This topic describes how to uninstall NebulaGraph.
Caution
Before re-installing NebulaGraph on a machine, follow this topic to completely uninstall the old NebulaGraph, in case the remaining data interferes with the new services, including inconsistencies between Meta services.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/#prerequisite","title":"Prerequisite","text":"The NebulaGraph services should be stopped before the uninstallation. For more information, see Manage NebulaGraph services.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/#step_1_delete_data_files_of_the_storage_and_meta_services","title":"Step 1: Delete data files of the Storage and Meta Services","text":"If you have modified the data_path
in the configuration files for the Meta Service and Storage Service, the directories where NebulaGraph stores data may not be in the installation path of NebulaGraph. Check the configuration files to confirm the data paths, and then manually delete the directories to clear all data.
Note
For a NebulaGraph cluster, delete the data files of all Storage and Meta servers.
Check the Storage Service disk settings. For example:
########## Disk ##########\n# Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/\n# One path per Rocksdb instance.\n--data_path=/nebula/data/storage\n
Check the Metad Service configurations and find the corresponding metadata directories.
Delete the data and the directories found in step 2.
Note
Delete all installation directories, including the cluster.id
file in them.
The default installation path is /usr/local/nebula
, which is specified by --prefix
while installing NebulaGraph.
Find the installation directories of NebulaGraph, and delete them all.
"},{"location":"4.deployment-and-installation/4.uninstall-nebula-graph/#uninstall_nebulagraph_deployed_with_rpm_packages","title":"Uninstall NebulaGraph deployed with RPM packages","text":"Run the following command to get the NebulaGraph version.
$ rpm -qa | grep \"nebula\"\n
The return message is as follows.
nebula-graph-master-1.x86_64\n
Run the following command to uninstall NebulaGraph.
sudo rpm -e <nebula_version>\n
For example:
sudo rpm -e nebula-graph-master-1.x86_64\n
Delete the installation directories.
Run the following command to get the NebulaGraph version.
$ dpkg -l | grep \"nebula\"\n
The return message is as follows.
ii nebula-graph master amd64 NebulaGraph Package built using CMake\n
Run the following command to uninstall NebulaGraph.
sudo dpkg -r <nebula_version>\n
For example:
sudo dpkg -r nebula-graph\n
Delete the installation directories.
In the nebula-docker-compose
directory, run the following command to stop the NebulaGraph services.
docker-compose down -v\n
Delete the nebula-docker-compose
directory.
This topic provides basic instruction on how to use the native CLI client NebulaGraph Console to connect to NebulaGraph.
Caution
When connecting to NebulaGraph for the first time, you must register the Storage Service before querying data.
NebulaGraph supports multiple types of clients, including a CLI client, a GUI client, and clients developed in popular programming languages. For more information, see the client list.
"},{"location":"4.deployment-and-installation/connect-to-nebula-graph/#prerequisites","title":"Prerequisites","text":"The NebulaGraph Console version is compatible with the NebulaGraph version.
Note
NebulaGraph Console and NebulaGraph of the same version number are the most compatible. There may be compatibility issues when connecting to NebulaGraph with a different version of NebulaGraph Console. The error message incompatible version between client and server
is displayed when there is such an issue.
On the NebulaGraph Console releases page, select a NebulaGraph Console version and click Assets.
Note
It is recommended to select the latest version.
In the Assets area, find the correct binary file for the machine where you want to run NebulaGraph Console and download the file to the machine.
(Optional) Rename the binary file to nebula-console
for convenience.
Note
For Windows, rename the file to nebula-console.exe
.
On the machine to run NebulaGraph Console, grant the execute permission of the nebula-console binary file to the user.
Note
For Windows, skip this step.
$ chmod 111 nebula-console\n
In the command line interface, change the working directory to the one where the nebula-console binary file is stored.
Run the following command to connect to NebulaGraph.
$ ./nebula-console -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
> nebula-console.exe -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP (or hostname) of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is second. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. For information on more parameters, see the project repository.
NebulaGraph supports managing services with scripts.
"},{"location":"4.deployment-and-installation/manage-service/#manage_services_with_script","title":"Manage services with script","text":"You can use the nebula.service
script to start, stop, restart, terminate, and check the NebulaGraph services.
Note
nebula.service
is stored in the /usr/local/nebula/scripts
directory by default. If you have customized the path, use the actual path in your environment.
$ sudo /usr/local/nebula/scripts/nebula.service\n[-v] [-c <config_file_path>]\n<start | stop | restart | kill | status>\n<metad | graphd | storaged | all>\n
Parameter Description -v
Display detailed debugging information. -c
Specify the configuration file path. The default path is /usr/local/nebula/etc/
. start
Start the target services. stop
Stop the target services. restart
Restart the target services. kill
Terminate the target services. status
Check the status of the target services. metad
Set the Meta Service as the target service. graphd
Set the Graph Service as the target service. storaged
Set the Storage Service as the target service. all
Set all the NebulaGraph services as the target services."},{"location":"4.deployment-and-installation/manage-service/#start_nebulagraph","title":"Start NebulaGraph","text":"Run the following command to start NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service start all\n[INFO] Starting nebula-metad...\n[INFO] Done\n[INFO] Starting nebula-graphd...\n[INFO] Done\n[INFO] Starting nebula-storaged...\n[INFO] Done\n
"},{"location":"4.deployment-and-installation/manage-service/#stop_nebulagraph","title":"Stop NebulaGraph","text":"Danger
Do not run kill -9
to forcibly terminate the processes. Otherwise, there is a low probability of data loss.
Run the following command to stop NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service stop all\n[INFO] Stopping nebula-metad...\n[INFO] Done\n[INFO] Stopping nebula-graphd...\n[INFO] Done\n[INFO] Stopping nebula-storaged...\n[INFO] Done\n
"},{"location":"4.deployment-and-installation/manage-service/#check_the_service_status","title":"Check the service status","text":"Run the following command to check the service status of NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service status all\n
NebulaGraph is running normally if the following information is returned.
INFO] nebula-metad(33fd35e): Running as 29020, Listening on 9559\n[INFO] nebula-graphd(33fd35e): Running as 29095, Listening on 9669\n[WARN] nebula-storaged after v3.0.0 will not start service until it is added to cluster.\n[WARN] See Manage Storage hosts:ADD HOSTS in https://docs.nebula-graph.io/\n[INFO] nebula-storaged(33fd35e): Running as 29147, Listening on 9779\n
Note
After starting NebulaGraph, the port of the nebula-storaged
process is shown in red. Because the nebula-storaged
process waits for the nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
[INFO] nebula-metad: Running as 25600, Listening on 9559\n[INFO] nebula-graphd: Exited\n[INFO] nebula-storaged: Running as 25646, Listening on 9779\n
The NebulaGraph services consist of the Meta Service, Graph Service, and Storage Service. The configuration files for all three services are stored in the /usr/local/nebula/etc/
directory by default. You can check the configuration files according to the returned result to troubleshoot problems.
Connect to NebulaGraph
"},{"location":"4.deployment-and-installation/manage-storage-host/","title":"Manage Storage hosts","text":"Starting from NebulaGraph 3.0.0, setting Storage hosts in the configuration files only registers the hosts on the Meta side, but does not add them into the cluster. You must run the ADD HOSTS
statement to add the Storage hosts.
Note
NebulaGraph Cloud clusters add Storage hosts automatically. Cloud users do not need to manually run ADD HOSTS
.
Add the Storage hosts to a NebulaGraph cluster.
nebula> ADD HOSTS <ip>:<port> [,<ip>:<port> ...];\nnebula> ADD HOSTS \"<hostname>\":<port> [,\"<hostname>\":<port> ...];\n
Note
SHOW HOSTS
to check whether the host is online.127.0.0.1:9779
.ADD HOSTS \"foo-bar\":9779
.Delete the Storage hosts from cluster.
Note
You can not delete an in-use Storage host directly. Delete the associated graph space before deleting the Storage host.
nebula> DROP HOSTS <ip>:<port> [,<ip>:<port> ...];\nnebula> DROP HOSTS \"<hostname>\":<port> [,\"<hostname>\":<port> ...];\n
"},{"location":"4.deployment-and-installation/manage-storage-host/#view_storage_hosts","title":"View Storage hosts","text":"View the Storage hosts in the cluster.
nebula> SHOW HOSTS STORAGE;\n+-------------+------+----------+-----------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-------------+------+----------+-----------+--------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | \"STORAGE\" | \"3ba41bd\" | \"master\" |\n+-------------+------+----------+-----------+--------------+---------+\n
"},{"location":"4.deployment-and-installation/standalone-deployment/","title":"Standalone NebulaGraph","text":"Standalone NebulaGraph merges the Meta, Storage, and Graph services into a single process deployed on a single machine. This topic introduces scenarios, deployment steps, etc. of standalone NebulaGraph.
Danger
Do not use standalone NebulaGraph in production environments.
"},{"location":"4.deployment-and-installation/standalone-deployment/#background","title":"Background","text":"The traditional NebulaGraph consists of three services, each service having executable binary files and the corresponding process. Processes communicate with each other by RPC. In standalone NebulaGraph, the three processes corresponding to the three services are combined into one process. For more information about NebulaGraph, see Architecture overview.
"},{"location":"4.deployment-and-installation/standalone-deployment/#scenarios","title":"Scenarios","text":"Small data sizes and low availability requirements. For example, test environments that are limited by the number of machines, scenarios that are only used to verify functionality.
"},{"location":"4.deployment-and-installation/standalone-deployment/#limitations","title":"Limitations","text":"For information about the resource requirements for standalone NebulaGraph, see Software requirements for compiling NebulaGraph.
"},{"location":"4.deployment-and-installation/standalone-deployment/#steps","title":"Steps","text":"Currently, you can only install standalone NebulaGraph with the source code. The steps are similar to those of the multi-process NebulaGraph. You only need to modify the step Generate Makefile with CMake by adding -DENABLE_STANDALONE_VERSION=on
to the command. For example:
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/nebula -DENABLE_TESTING=OFF -DENABLE_STANDALONE_VERSION=on -DCMAKE_BUILD_TYPE=Release .. \n
For more information about installation details, see Install NebulaGraph by compiling the source code.
After installing standalone NebulaGraph, see the topic connect to Service to connect to NebulaGraph databases.
"},{"location":"4.deployment-and-installation/standalone-deployment/#configuration_file","title":"Configuration file","text":"The path to the configuration file for standalone NebulaGraph is /usr/local/nebula/etc
by default.
You can run sudo cat nebula-standalone.conf.default
to see the file content. The parameters and the corresponding descriptions in the file are generally the same as the configurations for multi-process NebulaGraph except for the following parameters.
meta_port
9559
The port number of the Meta service. storage_port
9779
The port number of the Storage Service. meta_data_path
data/meta
The path to Meta data. You can run commands to check configurable parameters and the corresponding descriptions. For details, see Configurations.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/","title":"Install NebulaGraph by compiling the source code","text":"Installing NebulaGraph from the source code allows you to customize the compiling and installation settings and test the latest features.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#prerequisites","title":"Prerequisites","text":"Users have to prepare correct resources described in Prepare resources for compiling, installing, and running NebulaGraph.
Note
Compilation of NebulaGraph offline is not currently supported.
Use Git to clone the source code of NebulaGraph to the host.
[Recommended] To install NebulaGraph master, run the following command.
$ git clone --branch release-3.6 https://github.com/vesoft-inc/nebula.git\n
To install the latest developing release, run the following command to clone the source code from the master branch.
$ git clone https://github.com/vesoft-inc/nebula.git\n
Go to the nebula/third-party
directory, and run the install-third-party.sh
script to install the third-party libraries.
$ cd nebula/third-party\n$ ./install-third-party.sh\n
Go back to the nebula
directory, create a directory named build
, and enter the directory.
$ cd ..\n$ mkdir build && cd build\n
Generate Makefile with CMake.
Note
The installation path is /usr/local/nebula
by default. To customize it, add the -DCMAKE_INSTALL_PREFIX=<installation_path>
CMake variable in the following command.
For more information about CMake variables, see CMake variables.
$ cmake -DCMAKE_INSTALL_PREFIX=/usr/local/nebula -DENABLE_TESTING=OFF -DCMAKE_BUILD_TYPE=Release ..\n
Compile NebulaGraph.
Note
Check Prepare resources for compiling, installing, and running NebulaGraph.
To speed up the compiling, use the -j
option to set a concurrent number N
. It should be \\(\\min(\\text{CPU core number},\\frac{\\text{the memory size(GB)}}{2})\\).
$ make -j{N} # E.g., make -j2\n
Install NebulaGraph.
$ sudo make install\n
Note
The configuration files in the etc/
directory (/usr/local/nebula/etc
by default) are references. Users can create their own configuration files accordingly. If you want to use the scripts in the script
directory to start, stop, restart, and kill the service, and check the service status, the configuration files have to be named as nebula-graphd.conf
, nebula-metad.conf
, and nebula-storaged.conf
.
The source code of the master branch changes frequently. If the corresponding NebulaGraph release is installed, update it in the following steps.
In the nebula
directory, run git pull upstream master
to update the source code.
In the nebula/build
directory, run make -j{N}
and make install
again.
Manage NebulaGraph services
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#cmake_variables","title":"CMake variables","text":""},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#usage_of_cmake_variables","title":"Usage of CMake variables","text":"$ cmake -D<variable>=<value> ...\n
The following CMake variables can be used at the configure (cmake) stage to adjust the compiling settings.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#cmake_install_prefix","title":"CMAKE_INSTALL_PREFIX","text":"CMAKE_INSTALL_PREFIX
specifies the path where the service modules, scripts, configuration files are installed. The default path is /usr/local/nebula
.
ENABLE_WERROR
is ON
by default and it makes all warnings into errors. You can set it to OFF
if needed.
ENABLE_TESTING
is ON
by default and unit tests are built with the NebulaGraph services. If you just need the service modules, set it to OFF
.
ENABLE_ASAN
is OFF
by default and the building of ASan (AddressSanitizer), a memory error detector, is disabled. To enable it, set ENABLE_ASAN
to ON
. This variable is intended for NebulaGraph developers.
NebulaGraph supports the following building types of MAKE_BUILD_TYPE
:
Debug
The default value of CMAKE_BUILD_TYPE
. It indicates building NebulaGraph with the debug info but not the optimization options.
Release
It indicates building NebulaGraph with the optimization options but not the debug info.
RelWithDebInfo
It indicates building NebulaGraph with the optimization options and the debug info.
MinSizeRel
It indicates building NebulaGraph with the optimization options for controlling the code size but not the debug info.
ENABLE_INCLUDE_WHAT_YOU_USE
is OFF
by default. When set to ON
and include-what-you-use is installed on the system, the system reports redundant headers contained in the project source code during makefile generation.
Specifies the program linker on the system. The available values are:
bfd
, the default value, indicates that ld.bfd is applied as the linker.lld
, indicates that ld.lld, if installed on the system, is applied as the linker.gold
, indicates that ld.gold, if installed on the system, is applied as the linker.Usually, CMake locates and uses a C/C++ compiler installed in the host automatically. But if your compiler is not installed at the standard path, or if you want to use a different one, run the command as follows to specify the installation path of the target compiler:
$ cmake -DCMAKE_C_COMPILER=<path_to_gcc/bin/gcc> -DCMAKE_CXX_COMPILER=<path_to_gcc/bin/g++> ..\n$ cmake -DCMAKE_C_COMPILER=<path_to_clang/bin/clang> -DCMAKE_CXX_COMPILER=<path_to_clang/bin/clang++> ..\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/1.install-nebula-graph-by-compiling-the-source-code/#enable_ccache","title":"ENABLE_CCACHE","text":"ENABLE_CCACHE
is ON
by default and Ccache (compiler cache) is used to speed up the compiling of NebulaGraph.
To disable ccache
, setting ENABLE_CCACHE
to OFF
is not enough. On some platforms, the ccache
installation hooks up or precedes the compiler. In such a case, you have to set an environment variable export CCACHE_DISABLE=true
or add a line disable=true
in ~/.ccache/ccache.conf
as well. For more information, see the ccache official documentation.
NEBULA_THIRDPARTY_ROOT
specifies the path where the third party software is installed. By default it is /opt/vesoft/third-party
.
If the compiling fails, we suggest you:
Check whether the operating system release meets the requirements and whether the memory and hard disk space are sufficient.
Check whether the third-party is installed correctly.
Use make -j1
to reduce the compiling concurrency.
RPM and DEB are common package formats on Linux systems. This topic shows how to quickly install NebulaGraph with the RPM or DEB package.
Note
The console is not complied or packaged with NebulaGraph server binaries. You can install nebula-console by yourself.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/2.install-nebula-graph-by-rpm-or-deb/#prerequisites","title":"Prerequisites","text":"wget
is installed.Note
NebulaGraph is currently only supported for installation on Linux systems, and only CentOS 7.x, CentOS 8.x, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 operating systems are supported.
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.deb\n
For example, download the release package master
for Centos 7.5
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm.sha256sum.txt\n
Download the release package master
for Ubuntu 1804
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb.sha256sum.txt\n
Download the nightly version.
Danger
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu2004.amd64.deb\n
For example, download the Centos 7.5
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm.sha256sum.txt\n
For example, download the Ubuntu 1804
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb.sha256sum.txt\n
Use the following syntax to install with an RPM package.
$ sudo rpm -ivh --prefix=<installation_path> <package_name>\n
The option --prefix
indicates the installation path. The default path is /usr/local/nebula/
.
For example, to install an RPM package in the default path for the master version, run the following command.
sudo rpm -ivh nebula-graph-master.el7.x86_64.rpm\n
Use the following syntax to install with a DEB package.
$ sudo dpkg -i <package_name>\n
Note
Customizing the installation path is not supported when installing NebulaGraph with a DEB package. The default installation path is /usr/local/nebula/
.
For example, to install a DEB package for the master version, run the following command.
sudo dpkg -i nebula-graph-master.ubuntu1804.amd64.deb\n
Note
The default installation path is /usr/local/nebula/
.
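After installing with either package, you can start the services and check their status with the bundled script shown later in this manual (default installation path assumed):
sudo /usr/local/nebula/scripts/nebula.service start all\nsudo /usr/local/nebula/scripts/nebula.service status all\n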
With Docker Compose, you can quickly deploy NebulaGraph services based on the prepared configuration file. This method is recommended only for testing NebulaGraph functions.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#prerequisites","title":"Prerequisites","text":"You have installed the following applications on your host.
Application Recommended version Official installation reference Docker Latest Install Docker Engine Docker Compose Latest Install Docker Compose Git Latest Download Gitnebula-docker-compose/data
directory.Clone the release-3.6
branch of the nebula-docker-compose
repository to your host with Git.
Danger
The master
branch contains the untested code for the latest NebulaGraph development release. DO NOT use this release in a production environment.
$ git clone -b release-3.6 https://github.com/vesoft-inc/nebula-docker-compose.git\n
Note
The x.y
version of Docker Compose aligns with the x.y
version of NebulaGraph. For the z
(patch) version, Docker Compose does not publish a corresponding z
release, but pulls the z
version of the NebulaGraph image.
Go to the nebula-docker-compose
directory.
$ cd nebula-docker-compose/\n
Run the following command to start all the NebulaGraph services.
Note
[nebula-docker-compose]$ docker-compose up -d\nCreating nebula-docker-compose_metad0_1 ... done\nCreating nebula-docker-compose_metad2_1 ... done\nCreating nebula-docker-compose_metad1_1 ... done\nCreating nebula-docker-compose_graphd2_1 ... done\nCreating nebula-docker-compose_graphd_1 ... done\nCreating nebula-docker-compose_graphd1_1 ... done\nCreating nebula-docker-compose_storaged0_1 ... done\nCreating nebula-docker-compose_storaged2_1 ... done\nCreating nebula-docker-compose_storaged1_1 ... done\n
Compatibility
Starting from NebulaGraph version 3.1.0, nebula-docker-compose automatically starts a NebulaGraph Console docker container and adds the storage host to the cluster (i.e. ADD HOSTS
command).
Note
For more information of the preceding services, see NebulaGraph architecture.
There are two ways to connect to NebulaGraph: within the NebulaGraph Console docker container, as described in the steps below, or outside the container. Because the external mapping port of the Graph service is fixed as 9669
in the container's configuration file, you can connect directly through the default port. For details, see Connect to NebulaGraph.
Run the following command to view the name of the NebulaGraph Console docker container.
$ docker-compose ps\n Name Command State Ports \n-----------------------------------------------------------------------------------------------------------------------------------------------------------\nnebula-docker-compose_console_1 sh -c for i in `seq 1 60`; ... Up \nnebula-docker-compose_graphd1_1 /usr/local/nebula/bin/nebu ... Up (healthy) 0.0.0.0:32847->15669/tcp,:::32847->15669/tcp, 19669/tcp, \n 0.0.0.0:32846->19670/tcp,:::32846->19670/tcp, \n 0.0.0.0:32849->5669/tcp,:::32849->5669/tcp, 9669/tcp \n......\n
Note
nebula-docker-compose_console_1
and nebula-docker-compose_graphd1_1
are the container names of NebulaGraph Console and Graph Service respectively.
Run the following command to enter the NebulaGraph Console docker container.
docker exec -it nebula-docker-compose_console_1 /bin/sh\n/ #\n
Connect to NebulaGraph with NebulaGraph Console.
/ # ./usr/local/bin/nebula-console -u <user_name> -p <password> --address=graphd --port=9669\n
Note
By default, authentication is off, so you can log in only with an existing username (the default is root
) and any password. To turn it on, see Enable authentication.
Run the following commands to view the cluster state.
nebula> SHOW HOSTS;\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n| \"storaged0\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged1\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"storaged2\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n+-------------+------+----------+--------------+----------------------+------------------------+---------+\n
Run exit
twice to switch back to your terminal (shell).
Run docker-compose ps
to list all the services of NebulaGraph and their status and ports.
Note
NebulaGraph provides services to the clients through port 9669
by default. To use other ports, modify the docker-compose.yaml
file in the nebula-docker-compose
directory and restart the NebulaGraph services.
$ docker-compose ps\nnebula-docker-compose_console_1 sh -c sleep 3 && Up\n nebula-co ...\nnebula-docker-compose_graphd1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49174->19669/tcp,:::49174->19669/tcp, 0.0.0.0:49171->19670/tcp,:::49171->19670/tcp, 0.0.0.0:49177->9669/tcp,:::49177->9669/tcp\nnebula-docker-compose_graphd2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49175->19669/tcp,:::49175->19669/tcp, 0.0.0.0:49172->19670/tcp,:::49172->19670/tcp, 0.0.0.0:49178->9669/tcp,:::49178->9669/tcp\nnebula-docker-compose_graphd_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49180->19669/tcp,:::49180->19669/tcp, 0.0.0.0:49179->19670/tcp,:::49179->19670/tcp, 0.0.0.0:9669->9669/tcp,:::9669->9669/tcp\nnebula-docker-compose_metad0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49157->19559/tcp,:::49157->19559/tcp, 0.0.0.0:49154->19560/tcp,:::49154->19560/tcp, 0.0.0.0:49160->9559/tcp,:::49160->9559/tcp, 9560/tcp\nnebula-docker-compose_metad1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49156->19559/tcp,:::49156->19559/tcp, 0.0.0.0:49153->19560/tcp,:::49153->19560/tcp, 0.0.0.0:49159->9559/tcp,:::49159->9559/tcp, 9560/tcp\nnebula-docker-compose_metad2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49158->19559/tcp,:::49158->19559/tcp, 0.0.0.0:49155->19560/tcp,:::49155->19560/tcp, 0.0.0.0:49161->9559/tcp,:::49161->9559/tcp, 9560/tcp\nnebula-docker-compose_storaged0_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49166->19779/tcp,:::49166->19779/tcp, 0.0.0.0:49163->19780/tcp,:::49163->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49169->9779/tcp,:::49169->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged1_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49165->19779/tcp,:::49165->19779/tcp, 0.0.0.0:49162->19780/tcp,:::49162->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49168->9779/tcp,:::49168->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged2_1 /usr/local/nebula/bin/nebu ... Up 0.0.0.0:49167->19779/tcp,:::49167->19779/tcp, 0.0.0.0:49164->19780/tcp,:::49164->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:49170->9779/tcp,:::49170->9779/tcp, 9780/tcp\n
If the service is abnormal, you can first confirm the abnormal container name (such as nebula-docker-compose_graphd2_1
) and then log in to the container and troubleshoot.
$ docker exec -it nebula-docker-compose_graphd2_1 bash\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#check_the_service_data_and_logs","title":"Check the service data and logs","text":"All the data and logs of NebulaGraph are stored persistently in the nebula-docker-compose/data
and nebula-docker-compose/logs
directories.
The structure of the directories is as follows:
nebula-docker-compose/\n ├── docker-compose.yaml\n ├── data\n │   ├── meta0\n │   ├── meta1\n │   ├── meta2\n │   ├── storage0\n │   ├── storage1\n │   └── storage2\n └── logs\n ├── graph\n ├── graph1\n ├── graph2\n ├── meta0\n ├── meta1\n ├── meta2\n ├── storage0\n ├── storage1\n └── storage2\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#modify_configurations","title":"Modify configurations","text":"The configuration file of Docker Compose is nebula-docker-compose/docker-compose.yaml
. To make the new configuration take effect, modify the configuration in this file and restart the service.
The configurations in the docker-compose.yaml
file overwrite the configurations in the configuration file (/usr/local/nebula/etc
) of the containerized NebulaGraph service. Therefore, you can modify the configurations in the docker-compose.yaml
file to customize the configurations of the NebulaGraph service.
For more instructions, see Configurations.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#restart_nebulagraph_services","title":"Restart NebulaGraph services","text":"To restart all the NebulaGraph services, run the following command:
$ docker-compose restart\nRestarting nebula-docker-compose_console_1 ... done\nRestarting nebula-docker-compose_graphd_1 ... done\nRestarting nebula-docker-compose_graphd1_1 ... done\nRestarting nebula-docker-compose_graphd2_1 ... done\nRestarting nebula-docker-compose_storaged1_1 ... done\nRestarting nebula-docker-compose_storaged0_1 ... done\nRestarting nebula-docker-compose_storaged2_1 ... done\nRestarting nebula-docker-compose_metad1_1 ... done\nRestarting nebula-docker-compose_metad2_1 ... done\nRestarting nebula-docker-compose_metad0_1 ... done\n
To restart multiple services, such as graphd and storaged0, run the following command:
$ docker-compose restart graphd storaged0\nRestarting nebula-docker-compose_graphd_1 ... done\nRestarting nebula-docker-compose_storaged0_1 ... done\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#stop_and_remove_nebulagraph_services","title":"Stop and remove NebulaGraph services","text":"You can stop and remove all the NebulaGraph services by running the following command:
Danger
This command stops and removes all the containers of the NebulaGraph services and the related network. If you define volumes in the docker-compose.yaml
, the related data are retained.
The command docker-compose down -v
removes all the local data. Try this command if you are using the nightly release and having some compatibility issues.
$ docker-compose down\n
The following information indicates you have successfully stopped the NebulaGraph services:
Stopping nebula-docker-compose_console_1 ... done\nStopping nebula-docker-compose_graphd1_1 ... done\nStopping nebula-docker-compose_graphd_1 ... done\nStopping nebula-docker-compose_graphd2_1 ... done\nStopping nebula-docker-compose_storaged1_1 ... done\nStopping nebula-docker-compose_storaged0_1 ... done\nStopping nebula-docker-compose_storaged2_1 ... done\nStopping nebula-docker-compose_metad2_1 ... done\nStopping nebula-docker-compose_metad0_1 ... done\nStopping nebula-docker-compose_metad1_1 ... done\nRemoving nebula-docker-compose_console_1 ... done\nRemoving nebula-docker-compose_graphd1_1 ... done\nRemoving nebula-docker-compose_graphd_1 ... done\nRemoving nebula-docker-compose_graphd2_1 ... done\nRemoving nebula-docker-compose_storaged1_1 ... done\nRemoving nebula-docker-compose_storaged0_1 ... done\nRemoving nebula-docker-compose_storaged2_1 ... done\nRemoving nebula-docker-compose_metad2_1 ... done\nRemoving nebula-docker-compose_metad0_1 ... done\nRemoving nebula-docker-compose_metad1_1 ... done\nRemoving network nebula-docker-compose_nebula-net\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#faq","title":"FAQ","text":""},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#how_to_fix_the_docker_mapping_to_external_ports","title":"How to fix the docker mapping to external ports?","text":"To set the ports
of corresponding services as fixed mapping, modify the docker-compose.yaml
in the nebula-docker-compose
directory. For example:
graphd:\n image: vesoft/nebula-graphd:release-3.6\n ...\n ports:\n - 9669:9669\n - 19669\n - 19670\n
9669:9669
indicates that the internal port 9669 is mapped to the fixed external port 9669, while 19669
indicates that the internal port 19669 is mapped to a random external port.
In the nebula-docker-compose/docker-compose.yaml
file, change all the image
values to the required image version.
In the nebula-docker-compose
directory, run docker-compose pull
to update the images of the Graph Service, Storage Service, Meta Service, and NebulaGraph Console.
Run docker-compose up -d
to start the NebulaGraph services again.
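Taken together, a sketch of the upgrade flow described above; the release-3.7 tag is only an example of a newer version, and the sed command assumes the old tag appears only in the image lines:
# Point the image tags at the new version, then pull and recreate the services.\nsed -i 's/release-3.6/release-3.7/g' docker-compose.yaml\ndocker-compose pull\ndocker-compose up -d\n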
After connecting to NebulaGraph with NebulaGraph Console, run SHOW HOSTS GRAPH
, SHOW HOSTS STORAGE
, or SHOW HOSTS META
to check the version of the corresponding service.
ERROR: toomanyrequests
when docker-compose pull
","text":"You may meet the following error.
ERROR: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
.
You have met the rate limit of Docker Hub. Learn more on Understanding Docker Hub Rate Limiting.
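If you have a Docker Hub account, authenticating before pulling raises the limit. A sketch:
# Log in with your Docker Hub credentials, then retry the pull.\ndocker login\ndocker-compose pull\n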
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#how_to_update_the_nebulagraph_console_client","title":"How to update the NebulaGraph Console client?","text":"The command docker-compose pull
updates both the NebulaGraph services and the NebulaGraph Console.
offline
status?","text":"The activation script for storaged containers in Docker Compose may fail to run in rare cases. You can connect to NebulaGraph with NebulaGraph Console or NebulaGraph Studio and then manually run the ADD HOSTS
command to activate them by adding the storaged containers to the cluster. An example of the command is as follows:
nebula> ADD HOSTS \"storaged0\":9779,\"storaged1\":9779,\"storaged2\":9779\n
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/3.deploy-nebula-graph-with-docker-compose/#related_documents","title":"Related documents","text":"You can install NebulaGraph by downloading the tar.gz file.
Note
Download the NebulaGraph tar.gz file using the following address.
Before downloading, you need to replace <release_version>
with the version you want to download.
//Centos 7\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.tar.gz.sha256sum.txt\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.tar.gz.sha256sum.txt\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.tar.gz.sha256sum.txt\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.tar.gz.sha256sum.txt\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.tar.gz\n//Checksum\nhttps://oss-cdn.nebula-graph.com.cn/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.tar.gz.sha256sum.txt\n
For example, to download the NebulaGraph master tar.gz file for CentOS 7.5
, run the following command:
wget https://oss-cdn.nebula-graph.com.cn/package/master/nebula-graph-master.el7.x86_64.tar.gz\n
Decompress the tar.gz file to the NebulaGraph installation directory.
tar -xvzf <tar.gz_file_name> -C <install_path>\n
tar.gz_file_name
specifies the name of the tar.gz file.install_path
specifies the installation path.For example:
tar -xvzf nebula-graph-master.el7.x86_64.tar.gz -C /home/joe/nebula/install\n
Modify the name of the configuration file.
Enter the decompressed directory and, in the etc subdirectory, rename nebula-graphd.conf.default, nebula-metad.conf.default, and nebula-storaged.conf.default by removing the .default suffix to apply the default configuration of NebulaGraph.
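A minimal sketch of this renaming step, assuming the default etc layout of the decompressed package:
cd <install_path>/etc\nfor f in nebula-graphd nebula-metad nebula-storaged; do mv \"$f.conf.default\" \"$f.conf\"; done\n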
Note
To modify the configuration, see Configurations.
So far, you have installed NebulaGraph successfully.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/4.install-nebula-graph-from-tar/#next_to_do","title":"Next to do","text":"Manage NebulaGraph services
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/6.deploy-nebula-graph-with-peripherals/","title":"Install NebulaGraph with ecosystem tools","text":"You can install the NebulaGraph Community Edition with the following ecosystem tools:
NebulaGraph's source code is written in C++. Compiling NebulaGraph requires certain dependencies which might conflict with host system dependencies, potentially causing compilation failures. Docker offers a solution to this. NebulaGraph provides a Docker image containing the complete compilation environment, ensuring an efficient build process and avoiding host OS conflicts. This guide outlines the steps to compile NebulaGraph using Docker.
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/7.compile-using-docker/#prerequisites","title":"Prerequisites","text":"Before you begin:
Docker: Ensure Docker is installed on your system.
Clone NebulaGraph's Source Code: Clone the repository locally using:
git clone --branch release-3.6 https://github.com/vesoft-inc/nebula.git\n
This clones the NebulaGraph source code to a subdirectory named nebula
.
Pull the NebulaGraph compilation image.
docker pull vesoft/nebula-dev:ubuntu2004\n
Here, we use the official NebulaGraph compilation image, ubuntu2004
. For different versions, see nebula-dev-docker.
Start the compilation container.
docker run -ti \\\n --security-opt seccomp=unconfined \\\n -v \"$PWD\":/home \\\n -w /home \\\n --name nebula_dev \\\n vesoft/nebula-dev:ubuntu2004 \\\n bash\n
--security-opt seccomp=unconfined
: Disables the seccomp security mechanism to avoid compilation errors.-v \"$PWD\":/home
: Mounts the local path of the NebulaGraph code to the container's /home
directory.-w /home
: Sets the container's working directory to /home
. Any command run inside the container will use this directory as the current directory.--name nebula_dev
: Assigns a name to the container, making it easier to manage and operate.vesoft/nebula-dev:ubuntu2004
: Uses the ubuntu2004
version of the vesoft/nebula-dev
compilation image.bash
: Executes the bash
command inside the container, entering the container's interactive terminal.After executing this command, you'll enter an interactive terminal inside the container. To re-enter the container, use docker exec -ti nebula_dev bash
.
Compile NebulaGraph inside the container.
Enter the NebulaGraph source code directory.
cd nebula\n
Create a build directory and enter it.
mkdir build && cd build\n
Use CMake to generate the Makefile.
cmake -DCMAKE_CXX_COMPILER=$TOOLSET_CLANG_DIR/bin/g++ -DCMAKE_C_COMPILER=$TOOLSET_CLANG_DIR/bin/gcc -DENABLE_WERROR=OFF -DCMAKE_BUILD_TYPE=Debug -DENABLE_TESTING=OFF ..\n
For more on CMake, see CMake Parameters. Compile NebulaGraph.
# The -j parameter specifies the number of threads to use.\n# If you have a multi-core CPU, you can use more threads to speed up compilation.\nmake -j2\n
Compilation might take some time based on your system performance.
Install the Executables and Libraries.
Post successful compilation, NebulaGraph's binaries and libraries are located in /home/nebula/build
. Install them to /usr/local/nebula
:
make install\n
Once completed, NebulaGraph is compiled and installed in the host directory /usr/local/nebula
.
For now, NebulaGraph does not provide an official deployment tool. Users can deploy a NebulaGraph cluster with RPM or DEB package manually. This topic provides an example of deploying a NebulaGraph cluster on multiple servers (machines).
"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/deploy-nebula-graph-cluster/#deployment","title":"Deployment","text":"Machine name IP address Number of graphd Number of storaged Number of metad A 192.168.10.111 1 1 1 B 192.168.10.112 1 1 1 C 192.168.10.113 1 1 1 D 192.168.10.114 1 1 None E 192.168.10.115 1 1 None"},{"location":"4.deployment-and-installation/2.compile-and-install-nebula-graph/deploy-nebula-graph-cluster/#prerequisites","title":"Prerequisites","text":"Install NebulaGraph on each machine in the cluster. Available approaches of installation are as follows.
To deploy NebulaGraph according to your requirements, you have to modify the configuration files.
All the configuration files for NebulaGraph, including nebula-graphd.conf
, nebula-metad.conf
, and nebula-storaged.conf
, are stored in the etc
directory in the installation path. You only need to modify the configuration for the corresponding service on the machines. The configurations that need to be modified for each machine are as follows.
nebula-graphd.conf
, nebula-storaged.conf
, nebula-metad.conf
B nebula-graphd.conf
, nebula-storaged.conf
, nebula-metad.conf
C nebula-graphd.conf
, nebula-storaged.conf
, nebula-metad.conf
D nebula-graphd.conf
, nebula-storaged.conf
E nebula-graphd.conf
, nebula-storaged.conf
Users can refer to the content of the following configurations, which only show part of the cluster settings. The hidden content uses the default setting so that users can better understand the relationship between the servers in the NebulaGraph cluster.
Note
The main configuration to be modified is meta_server_addrs
. All configurations need to fill in the IP addresses and ports of all Meta services. At the same time, local_ip
needs to be modified as the network IP address of the machine itself. For detailed descriptions of the configuration parameters, see:
Deploy machine A
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.111\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.111\n# Storage daemon listening port\n--port=9779\n
nebula-metad.conf
########## networking ##########\n# Comma separated Meta Server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-metad process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.111\n# Meta daemon listening port\n--port=9559\n
Deploy machine B
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.112\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.112\n# Storage daemon listening port\n--port=9779\n
nebula-metad.conf
########## networking ##########\n# Comma separated Meta Server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-metad process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.112\n# Meta daemon listening port\n--port=9559\n
Deploy machine C
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.113\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.113\n# Storage daemon listening port\n--port=9779\n
nebula-metad.conf
########## networking ##########\n# Comma separated Meta Server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-metad process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.113\n# Meta daemon listening port\n--port=9559\n
Deploy machine D
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.114\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.114\n# Storage daemon listening port\n--port=9779\n
Deploy machine E
nebula-graphd.conf
########## networking ##########\n# Comma separated Meta Server Addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-graphd process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.115\n# Network device to listen on\n--listen_netdev=any\n# Port to listen on\n--port=9669\n
nebula-storaged.conf
########## networking ##########\n# Comma separated Meta server addresses\n--meta_server_addrs=192.168.10.111:9559,192.168.10.112:9559,192.168.10.113:9559\n# Local IP used to identify the nebula-storaged process.\n# Change it to an address other than loopback if the service is distributed or\n# will be accessed remotely.\n--local_ip=192.168.10.115\n# Storage daemon listening port\n--port=9779\n
Start the corresponding service on each machine. Descriptions are as follows.
Machine name The process to be started A graphd, storaged, metad B graphd, storaged, metad C graphd, storaged, metad D graphd, storaged E graphd, storagedThe command to start the NebulaGraph services is as follows.
sudo /usr/local/nebula/scripts/nebula.service start <metad|graphd|storaged|all>\n
Note
all
instead./usr/local/nebula
is the default installation path for NebulaGraph. Use the actual path if you have customized the path. For more information about how to start and stop the services, see Manage NebulaGraph services.Install the native CLI client NebulaGraph Console, then connect to any machine that has started the graphd process, run ADD HOSTS
command to add storage hosts, and run SHOW HOSTS
to check the cluster status. For example:
$ ./nebula-console --addr 192.168.10.111 --port 9669 -u root -p nebula\n\n2021/05/25 01:41:19 [INFO] connection pool is initialized successfully\nWelcome to NebulaGraph!\n\n> ADD HOSTS 192.168.10.111:9779, 192.168.10.112:9779, 192.168.10.113:9779, 192.168.10.114:9779, 192.168.10.115:9779;\n> SHOW HOSTS;\n+------------------+------+----------+--------------+----------------------+------------------------+---------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+------------------+------+----------+--------------+----------------------+------------------------+---------+\n| \"192.168.10.111\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.112\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.113\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.114\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n| \"192.168.10.115\" | 9779 | \"ONLINE\" | 0 | \"No valid partition\" | \"No valid partition\" | \"master\" |\n+------------------+------+-----------+----------+--------------+----------------------+------------------------+---------+\n
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/","title":"Upgrade NebulaGraph to master","text":"This topic describes how to upgrade NebulaGraph from version 2.x and 3.x to master, taking upgrading from version 2.6.1 to master as an example.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#applicable_source_versions","title":"Applicable source versions","text":"This topic applies to upgrading NebulaGraph from 2.5.0 and later 2.x, and 3.x versions to master. It does not apply to historical versions earlier than 2.5.0, including the 1.x versions.
To upgrade NebulaGraph from historical versions to master:
Caution
To upgrade NebulaGraph from versions earlier than 2.0.0 (including the 1.x versions) to master, you need to find the date_time_zonespec.csv
in the share/resources
directory of master files, and then copy it to the same directory in the NebulaGraph installation path.
Client compatibility
After the upgrade, you will not be able to connect to NebulaGraph from old clients. You will need to upgrade all clients to a version compatible with NebulaGraph master.
Configuration changes
A few configuration parameters have been changed. For more information, see the release notes and configuration docs.
nGQL compatibility
The nGQL syntax is partially incompatible:
YIELD
clause to return custom variables.YIELD
clause is required in the FETCH
, GO
, LOOKUP
, FIND PATH
and GET SUBGRAPH
statements.MATCH
statement. For example, from return v.name
to return v.player.name
.Full-text indexes
Before upgrading a NebulaGraph cluster with full-text indexes deployed, you must manually delete the full-text indexes in Elasticsearch, and then run the SIGN IN
command to log into ES and recreate the indexes after the upgrade is complete. To manually delete the full-text indexes in Elasticsearch, you can use the curl command curl -XDELETE -u <es_username>:<es_password> '<es_access_ip>:<port>/<fullindex_name>'
, for example, curl -XDELETE -u elastic:elastic 'http://192.168.8.xxx:9200/nebula_index_2534'
. If no username and password are set for Elasticsearch, you can omit the -u <es_username>:<es_password>
part.
Caution
There may be other undiscovered influences. Before the upgrade, we recommend that you read the release notes and user manual carefully, and keep an eye on the posts on the forum and issues on Github.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#preparations_before_the_upgrade","title":"Preparations before the upgrade","text":"Download the package of NebulaGraph master according to your operating system and system architecture. You need the binary files during the upgrade. Find the package on the download page.
Note
You can also get the new binaries from the source code or the RPM/DEB package.
Locate the data files based on the value of the data_path
parameters in the Storage and Meta configurations, and backup the data files. The default paths are nebula/data/storage
and nebula/data/meta
.
Danger
The old data will not be automatically backed up during the upgrade. You must manually back up the data to avoid data loss.
Collect the statistics of all graph spaces before the upgrade. After the upgrade, you can collect again and compare the results to make sure that no data is lost. To collect the statistics:
SUBMIT JOB STATS
.SHOW JOBS
and record the result.Stop all NebulaGraph services.
<nebula_install_path>/scripts/nebula.service stop all\n
nebula_install_path
indicates the installation path of NebulaGraph.
The storaged progress needs around 1 minute to flush data. You can run nebula.service status all
to check if all services are stopped. For more information about starting and stopping services, see Manage services.
Note
If the services are not fully stopped in 20 minutes, stop upgrading and ask for help on the forum or Github.
Caution
Starting from version 3.0.0, it is possible to insert vertices without tags. If you need to keep vertices without tags, add --graph_use_vertex_key=true
in the configuration file (nebula-graphd.conf
) of all Graph services within the cluster; and add --use_vertex_key=true
in the configuration file (nebula-storaged.conf
) of all Storage services.\"
In the target path where you unpacked the package, use the binaries in the bin
directory to replace the old binaries in the bin
directory in the NebulaGraph installation path.
Note
Update the binary of the corresponding service on each NebulaGraph server.
Modify the following parameters in all Graph configuration files to accommodate the value range of the new version. If the parameter values are within the specified range, skip this step.
session_idle_timeout_secs
. The recommended value is 28800.client_idle_timeout_secs
. The recommended value is 28800.The default values of these parameters in the 2.x versions are not within the range of the new version. If you do not change the default values, the upgrade will fail. For detailed parameter description, see Graph Service Configuration.
Start all Meta services.
<nebula_install_path>/scripts/nebula-metad.service start\n
Once started, the Meta services take several seconds to elect a leader.
To verify that Meta services are all started, you can start any Graph server, connect to it through NebulaGraph Console, and run SHOW HOSTS meta
and SHOW META LEADER
. If the status of Meta services are correctly returned, the services are successfully started.
Note
If the operation fails, stop the upgrade and ask for help on the forum or GitHub.
Start all the Graph and Storage services.
Note
If the operation fails, stop the upgrade and ask for help on the forum or GitHub.
Connect to the new version of NebulaGraph to verify that services are available and data are complete. For how to connect, see Connect to NebulaGraph.
Currently, there is no official way to check whether the upgrade is successful. You can run the following reference statements to test the upgrade:
nebula> SHOW HOSTS;\nnebula> SHOW HOSTS storage;\nnebula> SHOW SPACES;\nnebula> USE <space_name>\nnebula> SHOW PARTS;\nnebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\nnebula> MATCH (v) RETURN v LIMIT 5;\n
You can also test against new features in version master.
If the upgrade fails, stop all NebulaGraph services of the new version, recover the old configuration files and binaries, and start the services of the old version.
All NebulaGraph clients in use must be switched to the old version.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#faq","title":"FAQ","text":""},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#can_i_write_through_the_client_during_the_upgrade","title":"Can I write through the client during the upgrade?","text":"A: No. You must stop all NebulaGraph services during the upgrade.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#the_space_0_not_found_warning_message_during_the_upgrade_process","title":"TheSpace 0 not found
warning message during the upgrade process","text":"When the Space 0 not found
warning message appears during the upgrade process, you can ignore it. The space 0
is used to store meta information about the Storage service and does not contain user data, so it will not affect the upgrade.
A: You only need to update the configuration files and binaries of the Graph Service.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#how_to_resolve_the_error_permission_denied","title":"How to resolve the errorPermission denied
?","text":"A: Try again with the sudo privileges.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#is_there_any_change_in_gflags","title":"Is there any change in gflags?","text":"A: Yes. For more information, see the release notes and configuration docs.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#is_there_a_tool_or_solution_for_verifying_data_consistency_after_the_upgrade","title":"Is there a tool or solution for verifying data consistency after the upgrade?","text":"A: No. But if you only want to check the number of vertices and edges, run SUBMIT JOB STATS
and SHOW STATS
after the upgrade, and compare the result with the result that you recorded before the upgrade.
OFFLINE
and Leader count
is 0
?","text":"A: Run the following statement to add the Storage hosts into the cluster manually.
ADD HOSTS <ip>:<port>[, <ip>:<port> ...];\n
For example:
ADD HOSTS 192.168.10.100:9779, 192.168.10.101:9779, 192.168.10.102:9779;\n
If the issue persists, ask for help on the forum or GitHub.
"},{"location":"4.deployment-and-installation/3.upgrade-nebula-graph/upgrade-nebula-comm/#why_the_job_type_changed_after_the_upgrade_but_job_id_remains_the_same","title":"Why the job type changed after the upgrade, but job ID remains the same?","text":"A: SHOW JOBS
depends on an internal ID to identify job types, but in NebulaGraph 2.5.0 the internal ID changed in this pull request, so this issue happens after upgrading from a version earlier than 2.5.0.
This topic introduces the restrictions for full-text indexes. Please read the restrictions very carefully before using the full-text indexes.
Caution
The full-text index feature has been redone in version 3.6.0 and is not compatible with previous versions. If you want to continue to use wildcards, regulars, fuzzy matches, etc., there are 3 ways to do so as follows:
For now, full-text search has the following limitations:
LOOKUP
statements only.LIMIT
clause to return more records, up to 10,000. You can modify the ElasticSearch parameters to adjust the maximum number of records returned.STRING
or FIXED_STRING
.NULL
.Nebula\u00a0Graph full-text indexes are powered by Elasticsearch. This means that you can use Elasticsearch full-text query language to retrieve what you want. Full-text indexes are managed through built-in procedures. They can be created only for variable STRING
and FIXED_STRING
properties when the listener cluster and the Elasticsearch cluster are deployed.
Before you start using the full-text index, please make sure that you know the restrictions.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#deploy_elasticsearch_cluster","title":"Deploy Elasticsearch cluster","text":"To deploy an Elasticsearch cluster, see Kubernetes Elasticsearch deployment or Elasticsearch installation.
Note
To support external network access to Elasticsearch, set network.host
to 0.0.0.0
in config/elasticsearch.yml
.
You can configure the Elasticsearch to meet your business needs. To customize the Elasticsearch, see Elasticsearch Document.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#sign_in_to_the_text_search_clients","title":"Sign in to the text search clients","text":"When the Elasticsearch cluster is deployed, use the SIGN IN
statement to sign in to the Elasticsearch clients. Multiple elastic_ip:port
pairs are separated with commas. You must use the IPs and the port number in the configuration file for the Elasticsearch.
SIGN IN TEXT SERVICE (<elastic_ip:port>, {HTTP | HTTPS} [,\"<username>\", \"<password>\"]) [, (<elastic_ip:port>, ...)];\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#example","title":"Example","text":"nebula> SIGN IN TEXT SERVICE (192.168.8.100:9200, HTTP);\n
Note
Elasticsearch does not have a username or password by default. If you configured a username and password, you need to specify them in the SIGN IN
statement.
Caution
The Elasticsearch client can only be logged in once, and if there are changes, you need to SIGN OUT
and then SIGN IN
again, and the client takes effect globally, and multiple graph spaces share the same Elasticsearch client.
The SHOW TEXT SEARCH CLIENTS
statement can list the text search clients.
SHOW TEXT SEARCH CLIENTS;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#example_1","title":"Example","text":"nebula> SHOW TEXT SEARCH CLIENTS;\n+-----------------+-----------------+------+\n| Type | Host | Port |\n+-----------------+-----------------+------+\n| \"ELASTICSEARCH\" | \"192.168.8.100\" | 9200 |\n+-----------------+-----------------+------+\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#sign_out_to_the_text_search_clients","title":"Sign out to the text search clients","text":"The SIGN OUT TEXT SERVICE
statement can sign out all the text search clients.
SIGN OUT TEXT SERVICE;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es/#example_2","title":"Example","text":"nebula> SIGN OUT TEXT SERVICE;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/","title":"Deploy Raft Listener for NebulaGraph Storage service","text":"Full-text index data is written to the Elasticsearch cluster asynchronously. The Raft Listener (Listener for short) is a separate process that fetches data from the Storage Service and writes them into the Elasticsearch cluster.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#prerequisites","title":"Prerequisites","text":"The Listener service uses the same binary as the storaged service. However, the configuration files are different and the processes use different ports. You can install NebulaGraph on all servers that need to deploy a Listener, but only the storaged service can be used. For details, see Install NebulaGraph by RPM or DEB Package.
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#step_2_prepare_the_configuration_file_for_the_listener","title":"Step 2: Prepare the configuration file for the Listener","text":"In the etc
directory, remove the suffix from nebula-storaged-listener.conf.default
or nebula-storaged-listener.conf.production
to nebula-storaged-listener.conf
, and then modify the configuration content.
Most configurations are the same as the configurations of Storage Service. This topic only introduces the differences.
Name Default value Descriptiondaemonize
true
When set to true
, the process is a daemon process. pid_file
pids/nebula-metad-listener.pid
The file that records the process ID. meta_server_addrs
- IP (or hostname) and ports of all Meta services. Multiple Meta services are separated by commas. local_ip
- The local IP (or hostname) of the Listener service. Use real IP addresses instead of domain names or loopback IP addresses such as 127.0.0.1
. port
- The listening port of the RPC daemon of the Listener service. heartbeat_interval_secs
10
The heartbeat interval of the Meta service. The unit is second (s). listener_path
data/listener
The WAL directory of the Listener. Only one directory is allowed. data_path
data
For compatibility reasons, this parameter can be ignored. Fill in the default value data
. part_man_type
memory
The type of the part manager. Optional values are memory
and meta
. rocksdb_batch_size
4096
The default reserved bytes for batch operations. rocksdb_block_cache
4
The default block cache size of BlockBasedTable. The unit is Megabyte (MB). engine_type
rocksdb
The type of the Storage engine, such as rocksdb
, memory
, etc. part_type
simple
The type of the part, such as simple
, consensus
, etc."},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#step_3_start_listeners","title":"Step 3: Start Listeners","text":"To initiate the Listener, navigate to the installation path of the desired cluster and execute the following command:
./bin/nebula-storaged --flagfile etc/nebula-storaged-listener.conf\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#step_4_add_listeners_to_nebulagraph","title":"Step 4: Add Listeners to NebulaGraph","text":"Connect to NebulaGraph and run USE <space>
to enter the graph space that you want to create full-text indexes for. Then run the following statement to add a Listener into NebulaGraph.
ADD LISTENER ELASTICSEARCH <listener_ip:port> [,<listener_ip:port>, ...]\n
Warning
You must use real IPs for a Listener.
Add all Listeners in one statement completely.
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.100:9789,192.168.8.101:9789;\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#show_listeners","title":"Show Listeners","text":"Run the SHOW LISTENER
statement to list all Listeners.
nebula> SHOW LISTENER;\n+--------+-----------------+------------------------+-------------+\n| PartId | Type | Host | Host Status |\n+--------+-----------------+------------------------+-------------+\n| 1 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n| 2 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n| 3 | \"ELASTICSEARCH\" | \"\"192.168.8.100\":9789\" | \"ONLINE\" |\n+--------+-----------------+------------------------+-------------+\n
"},{"location":"4.deployment-and-installation/6.deploy-text-based-index/3.deploy-listener/#remove_listeners","title":"Remove Listeners","text":"Run the REMOVE LISTENER ELASTICSEARCH
statement to remove all Listeners in a graph space.
nebula> REMOVE LISTENER ELASTICSEARCH;\n
"},{"location":"5.configurations-and-logs/1.configurations/1.configurations/","title":"Configurations","text":"NebulaGraph builds the configurations based on the gflags repository. Most configurations are flags. When the NebulaGraph service starts, it will get the configuration information from Configuration files by default. Configurations that are not in the file apply the default values.
Note
Legacy version compatibility
In the topic of 1.x, we provide a method of using the CONFIGS
command to modify the configurations in the cache. However, using this method in a production environment can easily cause inconsistencies of configurations between clusters and the local. Therefore, this method will no longer be introduced starting with version 2.x.
Use the following command to get all the configuration information of the service corresponding to the binary file:
<binary> --help\n
For example:
# Get the help information from Meta\n$ /usr/local/nebula/bin/nebula-metad --help\n\n# Get the help information from Graph\n$ /usr/local/nebula/bin/nebula-graphd --help\n\n# Get the help information from Storage\n$ /usr/local/nebula/bin/nebula-storaged --help\n
The above examples use the default storage path /usr/local/nebula/bin/
. If you modify the installation path of NebulaGraph, use the actual path to query the configurations.
Use the curl
command to get the value of the running configurations.
For example:
# Get the running configurations from Meta\ncurl 127.0.0.1:19559/flags\n\n# Get the running configurations from Graph\ncurl 127.0.0.1:19669/flags\n\n# Get the running configurations from Storage\ncurl 127.0.0.1:19779/flags\n
Utilizing the -s
or `-silent option allows for the concealment of the progress bar and error messages. For example:
curl -s 127.0.0.1:19559/flags\n
Note
In an actual environment, use the real IP (or hostname) instead of 127.0.0.1
in the above example.
NebulaGraph provides two initial configuration files for each service, <service_name>.conf.default
and <service_name>.conf.production
. You can use them in different scenarios conveniently. For clusters installed from source and with a RPM/DEB package, the default path is /usr/local/nebula/etc/
. For clusters installed with a TAR package, the path is <install_path>/<tar_package_directory>/etc
.
The configuration values in the initial configuration file are for reference only and can be adjusted according to actual needs. To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
to make it valid.
Note
To ensure the availability of services, it is recommended that configurations for the same service be consistent, except for local_ip
. For example, three Storage servers are deployed in one NebulaGraph cluster. The configurations of the three Storage servers are recommended to be consistent, except for local_ip
.
The initial configuration files corresponding to each service are as follows.
NebulaGraph service Initial configuration file Description Metanebula-metad.conf.default
and nebula-metad.conf.production
Meta service configuration Graph nebula-graphd.conf.default
and nebula-graphd.conf.production
Graph service configuration Storage nebula-storaged.conf.default
and nebula-storaged.conf.production
Storage service configuration Each initial configuration file of all services contains local_config
. The default value is true
, which means that the NebulaGraph service will get configurations from its configuration files and start it.
Caution
It is not recommended to modify the value of local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.
For clusters installed with Docker Compose, the configuration file's default installation path of the cluster is <install_path>/nebula-docker-compose/docker-compose.yaml
. The parameters in the command
field of the file are the launch parameters for each service.
For clusters installed with Kubectl through NebulaGraph Operator, the configuration file's path is the path of the cluster YAML file. You can modify the configuration of each service through the spec.{graphd|storaged|metad}.config
parameter.
Note
The services cannot be configured for clusters installed with Helm.
"},{"location":"5.configurations-and-logs/1.configurations/1.configurations/#modify_configurations","title":"Modify configurations","text":"You can modify the configurations of NebulaGraph in the configuration file or use commands to dynamically modify configurations.
Caution
Using both methods to modify the configuration can cause the configuration information to be managed inconsistently, which may result in confusion. It is recommended to only use the configuration file to manage the configuration, or to make the same modifications to the configuration file after dynamically updating the configuration through commands to ensure consistency.
"},{"location":"5.configurations-and-logs/1.configurations/1.configurations/#modifying_configurations_in_the_configuration_file","title":"Modifying configurations in the configuration file","text":"By default, each NebulaGraph service gets configured from its configuration files. You can modify configurations and make them valid according to the following steps:
For clusters installed from source, with a RPM/DEB, or a TAR package
Use a text editor to modify the configuration files of the target service and save the modification.
Choose an appropriate time to restart all NebulaGraph services to make the modifications valid.
For clusters installed with Docker Compose
<install_path>/nebula-docker-compose/docker-compose.yaml
file, modify the configurations of the target service.nebula-docker-compose
directory, run the command docker-compose up -d
to restart the service involving configuration modifications.For clusters installed with Kubectl
For details, see Customize configuration parameters for a NebulaGraph cluster.
You can dynamically modify the configuration of NebulaGraph by using the curl command. For example, to modify the wal_ttl
parameter of the Storage service to 600
, use the following command:
curl -X PUT -H \"Content-Type: application/json\" -d'{\"wal_ttl\":\"600\"}' -s \"http://192.168.15.6:19779/flags\"\n
In this command, {\"wal_ttl\":\"600\"}
specifies the configuration parameter and its value to be modified, and 192.168.15.6:19779
specifies the IP address and HTTP port number of the Storage service.
Caution
local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted.NebulaGraph provides two initial configuration files for the Meta service, nebula-metad.conf.default
and nebula-metad.conf.production
. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/
.
Caution
local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
from the initial configuration file for the Meta Service to apply the configurations defined in it.
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default
.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
For all parameters and their current values, see Configurations.
"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#basics_configurations","title":"Basics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modificationsdaemonize
true
When set to true
, the process is a daemon process. No pid_file
pids/nebula-metad.pid
The file that records the process ID. No timezone_name
- Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. You can manually set it if you need it. The system default value is UTC+00:00:00
. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00
represents the GMT+8 time zone. No Note
timezone_name
. The time-type values returned by nGQL queries are all UTC time.timezone_name
is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.log_dir
logs
The directory that stores the Meta Service log. It is recommended to put logs on a different hard disk from the data. No minloglevel
0
Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs. Yes v
0
Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0
, 1
, 2
, 3
, 4
, 5
. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v
. For details, see Verbose Logging. Yes logbufsecs
0
Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0
means real-time output. This configuration is measured in seconds. No redirect_stdout
true
When set to true
, the process redirects thestdout
and stderr
to separate output files. No stdout_log_file
metad-stdout.log
Specifies the filename for the stdout
log. No stderr_log_file
metad-stderr.log
Specifies the filename for the stderr
log. No stderrthreshold
3
Specifies the minloglevel
to be copied to the stderr
log. No timestamp_in_logfile_name
true
Specifies if the log file name contains a timestamp. true
indicates yes, false
indicates no. No"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#networking_configurations","title":"Networking configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications meta_server_addrs
127.0.0.1:9559
Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No local_ip
127.0.0.1
Specifies the local IP (or hostname) for the Meta Service. The local IP address is used to identify the nebula-metad process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No port
9559
Specifies RPC daemon listening port of the Meta service. The neighboring +1
(9560
) port is used for Raft communication between Meta services. No ws_ip
0.0.0.0
Specifies the IP address for the HTTP service. No ws_http_port
19559
Specifies the port for the HTTP service. No ws_storage_http_port
19779
Specifies the Storage service listening port used by the HTTP protocol. It must be consistent with the ws_http_port
in the Storage service configuration file. This parameter only applies to standalone NebulaGraph. No Caution
It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0
cannot be parsed correctly in some cases.
data_path
data/meta
The storage path for Meta data. No"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#misc_configurations","title":"Misc configurations","text":"Name Predefined Value Description Whether supports runtime dynamic modifications default_parts_num
10
Specifies the default partition number when creating a new graph space. No default_replica_factor
1
Specifies the default replica number when creating a new graph space. No heartbeat_interval_secs
10
Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs
values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes agent_heartbeat_interval_secs
60
Specifies the default heartbeat interval for the Agent service. This configuration influences the time it takes for the system to determine that the Agent service is offline. This configuration is measured in seconds. No"},{"location":"5.configurations-and-logs/1.configurations/2.meta-config/#rocksdb_options_configurations","title":"RocksDB options configurations","text":"Name Predefined Value Description Whether supports runtime dynamic modifications rocksdb_wal_sync
true
Enables or disables RocksDB WAL synchronization. Available values are true
(enable) and false
(disable). No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/","title":"Graph Service configuration","text":"NebulaGraph provides two initial configuration files for the Graph Service, nebula-graphd.conf.default
and nebula-graphd.conf.production
. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/
.
Caution
local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
from the initial configuration file for the Meta Service to apply the configurations defined in it.
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default
.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
For all parameters and their current values, see Configurations.
"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#basics_configurations","title":"Basics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modificationsdaemonize
true
When set to true
, the process is a daemon process. No pid_file
pids/nebula-graphd.pid
The file that records the process ID. No enable_optimizer
true
When set to true
, the optimizer is enabled. No timezone_name
- Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. The system default value is UTC+00:00:00
. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00
represents the GMT+8 time zone. No default_charset
utf8
Specifies the default charset when creating a new graph space. No default_collate
utf8_bin
Specifies the default collate when creating a new graph space. No local_config
true
When set to true
, the process gets configurations from the configuration files. No Note
timezone_name
. The time-type values returned by nGQL queries are all UTC time.timezone_name
is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.log_dir
logs
The directory that stores the Graph service log. It is recommended to put logs on a different hard disk from the data. No minloglevel
0
Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs. Yes v
0
Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0
, 1
, 2
, 3
, 4
, 5
. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v
. For details, see Verbose Logging. Yes logbufsecs
0
Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0
means real-time output. This configuration is measured in seconds. No redirect_stdout
true
When set to true
, the process redirects thestdout
and stderr
to separate output files. No stdout_log_file
graphd-stdout.log
Specifies the filename for the stdout
log. No stderr_log_file
graphd-stderr.log
Specifies the filename for the stderr
log. No stderrthreshold
3
Specifies the minloglevel
to be copied to the stderr
log. No timestamp_in_logfile_name
true
Specifies if the log file name contains a timestamp. true
indicates yes, false
indicates no. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#query_configurations","title":"Query configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications accept_partial_success
false
When set to false
, the process treats partial success as an error. This configuration only applies to read-only requests. Write requests always treat partial success as an error. A partial success query will prompt Got partial result
. Yes session_reclaim_interval_secs
60
Specifies the interval that the Session information is sent to the Meta service. This configuration is measured in seconds. Yes max_allowed_query_size
4194304
Specifies the maximum length of queries. Unit: bytes. The default value is 4194304
, namely 4MB. Yes"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#networking_configurations","title":"Networking configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications meta_server_addrs
127.0.0.1:9559
Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No local_ip
127.0.0.1
Specifies the local IP (or hostname) for the Graph Service. The local IP address is used to identify the nebula-graphd process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No listen_netdev
any
Specifies the listening network device. No port
9669
Specifies RPC daemon listening port of the Graph service. No reuse_port
false
When set to false
, the SO_REUSEPORT
is closed. No listen_backlog
1024
Specifies the maximum length of the connection queue for socket monitoring. This configuration must be modified together with the net.core.somaxconn
. No client_idle_timeout_secs
28800
Specifies the time to expire an idle connection. The value ranges from 1 to 604800. The default is 8 hours. This configuration is measured in seconds. No session_idle_timeout_secs
28800
Specifies the time to expire an idle session. The value ranges from 1 to 604800. The default is 8 hours. This configuration is measured in seconds. No num_accept_threads
1
Specifies the number of threads that accept incoming connections. No num_netio_threads
0
Specifies the number of networking IO threads. 0
is the number of CPU cores. No num_max_connections
0
Max active connections for all networking threads. 0 means no limit.Max connections for each networking thread = num_max_connections / num_netio_threads No num_worker_threads
0
Specifies the number of threads that execute queries. 0
is the number of CPU cores. No ws_ip
0.0.0.0
Specifies the IP address for the HTTP service. No ws_http_port
19669
Specifies the port for the HTTP service. No heartbeat_interval_secs
10
Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs
values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes storage_client_timeout_ms
- Specifies the RPC connection timeout threshold between the Graph Service and the Storage Service. This parameter is not predefined in the initial configuration files. You can manually set it if you need it. The system default value is 60000
ms. No slow_query_threshold_us
200000
When the execution time of a query exceeds the value, the query is called a slow query. Unit: Microsecond.Note: Even if the execution time of DML statements exceeds this value, they will not be recorded as slow queries. No ws_meta_http_port
19559
Specifies the Meta service listening port used by the HTTP protocol. It must be consistent with the ws_http_port
in the Meta service configuration file. No Caution
It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0
cannot be parsed correctly in some cases.
enable_authorize
false
When set to false
, the system authentication is not enabled. For more information, see Authentication. No auth_type
password
Specifies the login method. Available values are password
, ldap
, and cloud
. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#memory_configurations","title":"Memory configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications system_memory_high_watermark_ratio
0.8
Specifies the trigger threshold of the high-level memory alarm mechanism. If the system memory usage is higher than this value, an alarm mechanism will be triggered, and NebulaGraph will stop querying. This parameter is not predefined in the initial configuration files. Yes"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#metrics_configurations","title":"Metrics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications enable_space_level_metrics
false
Enable or disable space-level metrics. Such metric names contain the name of the graph space that it monitors, for example, query_latency_us{space=basketballplayer}.avg.3600
. You can view the supported metrics with the curl
command. For more information, see Query NebulaGraph metrics. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#session_configurations","title":"Session configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications max_sessions_per_ip_per_user
300
The maximum number of active sessions that can be created from a single IP adddress for a single user. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#experimental_configurations","title":"Experimental configurations","text":"Note
The switch of the experimental feature is only available in the Community Edition.
Name Predefined value Description Whether supports runtime dynamic modificationsenable_experimental_feature
false
Specifies the experimental feature. Optional values are true
and false
. No enable_data_balance
true
Whether to enable the BALANCE DATA feature. Only works when enable_experimental_feature
is true
. No"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#memory_tracker_configurations","title":"Memory tracker configurations","text":"Note
Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Graph service.
/sys/fs/cgroup/graphd/
, and then add and configure the memory.max
file under the directory.Add the following configurations to etc/nebula-graphd.conf
.
--containerized=true\n--cgroup_v2_controllers=/sys/fs/cgroup/graphd/cgroup.controllers\n--cgroup_v2_memory_stat_path=/sys/fs/cgroup/graphd/memory.stat\n--cgroup_v2_memory_max_path=/sys/fs/cgroup/graphd/memory.max\n--cgroup_v2_memory_current_path=/sys/fs/cgroup/graphd/memory.current\n
For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.
Name Predefined value Description Whether supports runtime dynamic modificationsmemory_tracker_limit_ratio
0.8
The value of this parameter can be set to (0, 1]
, 2
, and 3
.Caution: When setting this parameter, ensure that the value of system_memory_high_watermark_ratio
is not set to 1
, otherwise the value of this parameter will not take effect.(0, 1]
: The percentage of available memory. Formula: Percentage of available memory = Available memory / (Total memory - Reserved memory)
.When an ongoing query results in memory usage exceeding the configured limit, the query fails and subsequently the memory is released. Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, the value of memory_tracker_limit_ratio
should be set to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than 0.5
.2
: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory. Note: This feature is experimental. As memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur to handle large memory allocations. 3
: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded. Yes memory_tracker_untracked_reserved_memory_mb
50
The reserved memory that is not tracked by the memory tracker. Unit: MB. Yes memory_tracker_detail_log
false
Whether to enable the memory tracker log. When the value is true
, the memory tracker log is generated. Yes memory_tracker_detail_log_interval_ms
60000
The time interval for generating the memory tracker log. Unit: Millisecond. memory_tracker_detail_log
is true
when this parameter takes effect. Yes memory_purge_enabled
true
Whether to enable the memory purge feature. When the value is true
, the memory purge feature is enabled. Yes memory_purge_interval_seconds
10
The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if memory_purge_enabled
is set to true. Yes"},{"location":"5.configurations-and-logs/1.configurations/3.graph-config/#performance_optimization_configurations","title":"performance optimization configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications max_job_size
1
The maximum number of concurrent jobs, i.e., the maximum number of threads used in the phase of query execution where concurrent execution is possible. It is recommended to be half of the physical CPU cores. Yes min_batch_size
8192
The minimum batch size for processing the dataset. Takes effect only when max_job_size
is greater than 1. Yes optimize_appendvertices
false
When enabled, the MATCH
statement is executed without filtering dangling edges. Yes path_batch_size
10000
The number of paths constructed per thread. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/","title":"Storage Service configurations","text":"NebulaGraph provides two initial configuration files for the Storage Service, nebula-storaged.conf.default
and nebula-storaged.conf.production
. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/
.
Caution
local_config
to false
. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.To use the initial configuration file, choose one of the above two files and delete the suffix .default
or .production
from the initial configuration file for the Meta Service to apply the configurations defined in it.
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default
. For parameters that are not included in nebula-metad.conf.default
, see nebula-storaged.conf.production
.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config
value is set to true
, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
Note
The configurations of the Raft Listener and the Storage service are different. For details, see Deploy Raft listener.
For all parameters and their current values, see Configurations.
"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#basics_configurations","title":"Basics configurations","text":"Name Predefined value Description Whether supports runtime dynamic modificationsdaemonize
true
When set to true
, the process is a daemon process. No pid_file
pids/nebula-storaged.pid
The file that records the process ID. No timezone_name
UTC+00:00:00
Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00
represents the GMT+8 time zone. No local_config
true
When set to true
, the process gets configurations from the configuration files. No Note
timezone_name
. The time-type values returned by nGQL queries are all UTC.timezone_name
is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.log_dir
logs
The directory that stores the Storage service log. It is recommended to put logs on a different hard disk from the data. No minloglevel
0
Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs. Yes v
0
Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0
, 1
, 2
, 3
, 4
, 5
. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v
. For details, see Verbose Logging. Yes logbufsecs
0
Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0
means real-time output. This configuration is measured in seconds. No redirect_stdout
true
When set to true
, the process redirects thestdout
and stderr
to separate output files. No stdout_log_file
graphd-stdout.log
Specifies the filename for the stdout
log. No stderr_log_file
graphd-stderr.log
Specifies the filename for the stderr
log. No stderrthreshold
3
Specifies the minloglevel
to be copied to the stderr
log. No timestamp_in_logfile_name
true
Specifies if the log file name contains a timestamp. true
indicates yes, false
indicates no. No"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#networking_configurations","title":"Networking configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications meta_server_addrs
127.0.0.1:9559
Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No local_ip
127.0.0.1
Specifies the local IP (or hostname) for the Storage Service. The local IP address is used to identify the nebula-storaged process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No port
9779
Specifies RPC daemon listening port of the Storage service. The neighboring ports -1
(9778
) and +1
(9780
) are also used. 9778
: The port used by the Admin service, which receives Meta commands for Storage. 9780
: The port used for Raft communication between Storage services. No ws_ip
0.0.0.0
Specifies the IP address for the HTTP service. No ws_http_port
19779
Specifies the port for the HTTP service. No heartbeat_interval_secs
10
Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs
values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes Caution
It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0
cannot be parsed correctly in some cases.
raft_heartbeat_interval_secs
30
Specifies the time to expire the Raft election. The configuration is measured in seconds. Yes raft_rpc_timeout_ms
500
Specifies the time to expire the Raft RPC. The configuration is measured in milliseconds. Yes wal_ttl
14400
Specifies the lifetime of the RAFT WAL. The configuration is measured in seconds. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#disk_configurations","title":"Disk configurations","text":"Name Predefined value Description Whether supports runtime dynamic modifications data_path
data/storage
Specifies the data storage path. Multiple paths are separated with commas. For NebulaGraph of the community edition, one RocksDB instance corresponds to one path. No minimum_reserved_bytes
268435456
Specifies the minimum remaining space of each data storage path. When the value is lower than this standard, the cluster data writing may fail. This configuration is measured in bytes. No rocksdb_batch_size
4096
Specifies the block cache for a batch operation. The configuration is measured in bytes. No rocksdb_block_cache
4
Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes. No disable_page_cache
false
Enables or disables the operating system's page cache for NebulaGraph. By default, the parameter value is false
and page cache is enabled. If the value is set to true
, page cache is disabled and sufficient block cache space must be configured for NebulaGraph. No engine_type
rocksdb
Specifies the engine type. No rocksdb_compression
lz4
Specifies the compression algorithm for RocksDB. Optional values are no
, snappy
, lz4
, lz4hc
, zlib
, bzip2
, and zstd
.This parameter modifies the compression algorithm for each level. If you want to set different compression algorithms for each level, use the parameter rocksdb_compression_per_level
. No rocksdb_compression_per_level
\\ Specifies the compression algorithm for each level. The priority is higher than rocksdb_compression
. For example, no:no:lz4:lz4:snappy:zstd:snappy
.You can also not set certain levels of compression algorithms, for example, no:no:lz4:lz4::zstd
, level L4 and L6 use the compression algorithm of rocksdb_compression
. No enable_rocksdb_statistics
false
When set to false
, RocksDB statistics is disabled. No rocksdb_stats_level
kExceptHistogramOrTimers
Specifies the stats level for RocksDB. Optional values are kExceptHistogramOrTimers
, kExceptTimers
, kExceptDetailedTimers
, kExceptTimeForMutex
, and kAll
. No enable_rocksdb_prefix_filtering
true
When set to true
, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter makes the graph traversal faster but occupies more memory. No enable_rocksdb_whole_key_filtering
false
When set to true
, the whole key bloom filter for RocksDB is enabled. rocksdb_filtering_prefix_length
12
Specifies the prefix length for each key. Optional values are 12
and 16
. The configuration is measured in bytes. No enable_partitioned_index_filter
false
When set to true
, it reduces the amount of memory used by the bloom filter. But in some random-seek situations, it may reduce the read performance. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. No"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#rocksdb_options","title":"RocksDB options","text":"Name Predefined value Description Whether supports runtime dynamic modifications rocksdb_db_options
{}
Specifies the RocksDB database options. No rocksdb_column_family_options
{\"write_buffer_size\":\"67108864\",
\"max_write_buffer_number\":\"4\",
\"max_bytes_for_level_base\":\"268435456\"}
Specifies the RocksDB column family options. No rocksdb_block_based_table_options
{\"block_size\":\"8192\"}
Specifies the RocksDB block based table options. No The format of the RocksDB option is {\"<option_name>\":\"<option_value>\"}
. Multiple options are separated with commas.
Supported options of rocksdb_db_options
and rocksdb_column_family_options
are listed as follows.
rocksdb_db_options
max_total_wal_size\ndelete_obsolete_files_period_micros\nmax_background_jobs\nstats_dump_period_sec\ncompaction_readahead_size\nwritable_file_max_buffer_size\nbytes_per_sync\nwal_bytes_per_sync\ndelayed_write_rate\navoid_flush_during_shutdown\nmax_open_files\nstats_persist_period_sec\nstats_history_buffer_size\nstrict_bytes_per_sync\nenable_rocksdb_prefix_filtering\nenable_rocksdb_whole_key_filtering\nrocksdb_filtering_prefix_length\nnum_compaction_threads\nrate_limit\n
rocksdb_column_family_options
write_buffer_size\nmax_write_buffer_number\nlevel0_file_num_compaction_trigger\nlevel0_slowdown_writes_trigger\nlevel0_stop_writes_trigger\ntarget_file_size_base\ntarget_file_size_multiplier\nmax_bytes_for_level_base\nmax_bytes_for_level_multiplier\ndisable_auto_compactions \n
For more information, see RocksDB official documentation.
"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#misc_configurations","title":"Misc configurations","text":"Caution
The configuration snapshot
in the following table is different from the snapshot in NebulaGraph. The snapshot
here refers to the stock data on the leader when synchronizing Raft.
query_concurrently
true
Whether to turn on multi-threaded queries. Enabling it can improve the latency performance of individual queries, but it will reduce the overall throughput under high pressure. Yes auto_remove_invalid_space
true
After executing DROP SPACE
, the specified graph space will be deleted. This parameter sets whether to delete all the data in the specified graph space at the same time. When the value is true
, all the data in the specified graph space will be deleted at the same time. Yes num_io_threads
16
The number of network I/O threads used to send RPC requests and receive responses. No num_max_connections
0
Max active connections for all networking threads. 0 means no limit.Max connections for each networking thread = num_max_connections / num_netio_threads No num_worker_threads
32
The number of worker threads for one RPC-based Storage service. No max_concurrent_subtasks
10
The maximum number of concurrent subtasks to be executed by the task manager. No snapshot_part_rate_limit
10485760
The rate limit when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes/s. Yes snapshot_batch_size
1048576
The amount of data sent in each batch when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes. Yes rebuild_index_part_rate_limit
4194304
The rate limit when the Raft leader synchronizes the index data rate with other members of the Raft group during the index rebuilding process. Unit: bytes/s. Yes rebuild_index_batch_size
1048576
The amount of data sent in each batch when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#memory_tracker_configurations","title":"Memory Tracker configurations","text":"Note
Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Storage service.
/sys/fs/cgroup/storaged/
, and then add and configure the memory.max
file under the directory.Add the following configurations to etc/nebula-storaged.conf
.
--containerized=true\n--cgroup_v2_controllers=/sys/fs/cgroup/graphd/cgroup.controllers\n--cgroup_v2_memory_stat_path=/sys/fs/cgroup/graphd/memory.stat\n--cgroup_v2_memory_max_path=/sys/fs/cgroup/graphd/memory.max\n--cgroup_v2_memory_current_path=/sys/fs/cgroup/graphd/memory.current\n
For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.
Name Predefined value Description Whether supports runtime dynamic modificationsmemory_tracker_limit_ratio
0.8
The value of this parameter can be set to (0, 1]
, 2
, and 3
.(0, 1]
: The percentage of available memory. Formula: Percentage of available memory = Available memory / (Total memory - Reserved memory)
.When an ongoing query results in memory usage exceeding the configured limit, the query fails and subsequently the memory is released. Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, the value of memory_tracker_limit_ratio
should be set to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than 0.5
.2
: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory. Note: This feature is experimental. As memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur to handle large memory allocations. 3
: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded. Yes memory_tracker_untracked_reserved_memory_mb
50
The reserved memory that is not tracked by the Memory Tracker. Unit: MB. Yes memory_tracker_detail_log
false
Whether to enable the Memory Tracker log. When the value is true
, the Memory Tracker log is generated. Yes memory_tracker_detail_log_interval_ms
60000
The time interval for generating the Memory Tracker log. Unit: Millisecond. memory_tracker_detail_log
is true
when this parameter takes effect. Yes memory_purge_enabled
true
Whether to enable the memory purge feature. When the value is true
, the memory purge feature is enabled. Yes memory_purge_interval_seconds
10
The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if memory_purge_enabled
is set to true. Yes"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#for_super-large_vertices","title":"For super-Large vertices","text":"When the query starting from each vertex gets an edge, truncate it directly to avoid too many neighboring edges on the super-large vertex, because a single query occupies too much hard disk and memory. Or you can truncate a certain number of edges specified in the Max_edge_returned_per_vertex
parameter. Excess edges will not be returned. This parameter applies to all spaces.
2147483647
Specifies the maximum number of edges returned for each dense vertex. Excess edges are truncated and not returned. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. No"},{"location":"5.configurations-and-logs/1.configurations/4.storage-config/#storage_configurations_for_large_dataset","title":"Storage configurations for large dataset","text":"Warning
One graph space takes up at least about 300 MB of memory.
When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter
parameter to true
. The performance is affected because RocksDB indexes are cached.
This topic introduces the Kernel configurations in Nebula\u00a0Graph.
"},{"location":"5.configurations-and-logs/1.configurations/6.kernel-config/#resource_control","title":"Resource control","text":"You may run the ulimit
command to control the resource threshold. However, the changes made only take effect for the current session or sub-process. To make permanent changes, edit file /etc/security/limits.conf
. The configuration is as follows:
# <domain> <type> <item> <value>\n* soft core unlimited \n* hard core unlimited \n* soft nofile 130000 \n* hard nofile 130000\n
Note
The configuration modification takes effect for new sessions.
The parameter descriptions are as follows.
Parameter Descriptiondomain
Control Domain. This parameter can be a user name, a user group name (starting with @
), or *
to indicate all users. type
Control type. This parameter can be soft
or hard
. soft
indicates a soft threshold (the default threshold) for the resource and hard
indicates a maximum value that can be set by the user. The ulimit
command can be used to increase soft
, but not to exceed hard
. item
Resource types. For example, core
limits the size of the core dump file, and nofile
limits the maximum number of file descriptors a process can open. value
Resource limit value. This parameter can be a number, or unlimited
to indicate that there is no limit. You can run man limits.conf
for more helpful information.
vm.swappiness
specifies the percentage of the available memory before starting swap. The greater the value, the more likely the swap occurs. We recommend that you set it to 0. When set to 0, the page cache is removed first. Note that when vm.swappiness
is 0, it does not mean that there is no swap.
vm.min_free_kbytes
specifies the minimum number of kilobytes available kept by Linux VM. If you have a large system memory, we recommend that you increase this value. For example, if your physical memory 128GB, set it to 5GB. If the value is not big enough, the system cannot apply for enough continuous physical memory.
vm.max_map_count
limits the maximum number of vma (virtual memory area) for a process. The default value is 65530
. It is enough for most applications. If your memory application fails because the memory consumption is large, increase the vm.max_map_count
value.
These values control the dirty data cache for the system. For write-intensive scenarios, you can make adjustments based on your needs (throughput priority or delay priority). We recommend that you use the system default value.
"},{"location":"5.configurations-and-logs/1.configurations/6.kernel-config/#transparent_huge_pages","title":"Transparent Huge Pages","text":"Transparent Huge Pages (THP) is a memory management feature of the Linux kernel, which enhances the system's ability to use large pages. In most database systems, Transparent Huge Pages can degrade performance, so it is recommended to disable it.
Perform the following steps:
Edit the GRUB configuration file /etc/default/grub
.
sudo vi /etc/default/grub\n
Add transparent_hugepage=never
to the GRUB_CMDLINE_LINUX
option, and then save and exit.
GRUB_CMDLINE_LINUX=\"... transparent_hugepage=never\"\n
Update the GRUB configuration.
For CentOS:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg\n
For Ubuntu:
sudo update-grub\n
Reboot the computer.
sudo reboot\n
If you don't want to reboot, you can run the following commands to temporarily disable THP until the next reboot.
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled\necho 'never' > /sys/kernel/mm/transparent_hugepage/defrag\n
The default value of net.ipv4.tcp_slow_start_after_idle
is 1
. If set, the congestion window is timed out after an idle period. We recommend that you set it to 0
, especially for long fat scenarios (high latency and large bandwidth).
net.core.somaxconn
specifies the maximum number of connection queues listened by the socket. The default value is 128
. For scenarios with a large number of burst connections, we recommend that you set it to greater than 1024
.
net.ipv4.tcp_max_syn_backlog
specifies the maximum number of TCP connections in the SYN_RECV (semi-connected) state. The setting rule for this parameter is the same as that of net.core.somaxconn
.
net.core.netdev_max_backlog
specifies the maximum number of packets. The default value is 1000
. We recommend that you increase it to greater than 10,000
, especially for 10G network adapters.
These values keep parameters alive for TCP connections. For applications that use a 4-layer transparent load balancer, if the idle connection is disconnected unexpectedly, decrease the values of tcp_keepalive_time
and tcp_keepalive_intvl
.
net.ipv4.tcp_wmem/rmem
specifies the minimum, default, and maximum size of the buffer pool sent/received by the TCP socket. For long fat links, we recommend that you increase the default value to bandwidth (GB) * RTT (ms)
.
For SSD devices, we recommend that you set scheduler
to noop
or none
. The path is /sys/block/DEV_NAME/queue/scheduler
.
we recommend that you set it to core
and set kernel.core_uses_pid
to 1
.
sysctl <conf_name>
Checks the current parameter value.
sysctl -w <conf_name>=<value>
Modifies the parameter value. The modification takes effect immediately. The original value is restored after restarting.
sysctl -p [<file_path>]
Loads Linux parameter values \u200b\u200bfrom the specified configuration file. The default path is /etc/sysctl.conf
.
The prlimit
command gets and sets process resource limits. You can modify the hard threshold by using it and the sudo
command. For example, prlimit --nofile = 130000 --pid = $$
adjusts the maximum number of open files permitted by the current process to 14000
. And the modification takes effect immediately. Note that this command is only available in RedHat 7u or higher versions.
Runtime logs are provided for DBAs and developers to locate faults when the system fails.
NebulaGraph uses glog to print runtime logs, uses gflags to control the severity level of the log, and provides an HTTP interface to dynamically change the log level at runtime to facilitate tracking.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#log_directory","title":"Log directory","text":"The default runtime log directory is /usr/local/nebula/logs/
.
If the log directory is deleted while NebulaGraph is running, the log would not continue to be printed. However, this operation will not affect the services. To recover the logs, restart the services.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#parameter_descriptions","title":"Parameter descriptions","text":"minloglevel
: Specifies the minimum level of the log. That is, no logs below this level will be printed. Optional values are 0
(INFO), 1
(WARNING), 2
(ERROR), 3
(FATAL). It is recommended to set it to 0
during debugging and 1
in a production environment. If it is set to 4
, NebulaGraph will not print any logs.v
: Specifies the detailed level of the log. The larger the value, the more detailed the log is. Optional values are 0
, 1
, 2
, 3
.The default severity level for the metad, graphd, and storaged logs can be found in their respective configuration files. The default path is /usr/local/nebula/etc/
.
Check all the flag values (log values included) of the current gflags with the following command.
$ curl <ws_ip>:<ws_port>/flags\n
Parameter Description ws_ip
The IP address for the HTTP service, which can be found in the configuration files above. The default value is 127.0.0.1
. ws_port
The port for the HTTP service, which can be found in the configuration files above. The default values are 19559
(Meta), 19669
(Graph), and 19779
(Storage) respectively. Examples are as follows:
minloglevel
in the Meta service:$ curl 127.0.0.1:19559/flags | grep 'minloglevel'\n
v
in the Storage service:$ curl 127.0.0.1:19779/flags | grep -w 'v'\n
Change the severity level of the log with the following command.
$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"<key>\":<value>[,\"<key>\":<value>]}' \"<ws_ip>:<ws_port>/flags\"\n
Parameter Description key
The type of the log to be changed. For optional values, see Parameter descriptions. value
The level of the log. For optional values, see Parameter descriptions. ws_ip
The IP address for the HTTP service, which can be found in the configuration files above. The default value is 127.0.0.1
. ws_port
The port for the HTTP service, which can be found in the configuration files above. The default values are 19559
(Meta), 19669
(Graph), and 19779
(Storage) respectively. Examples are as follows:
$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"minloglevel\":0,\"v\":3}' \"127.0.0.1:19779/flags\" # storaged\n$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"minloglevel\":0,\"v\":3}' \"127.0.0.1:19669/flags\" # graphd\n$ curl -X PUT -H \"Content-Type: application/json\" -d '{\"minloglevel\":0,\"v\":3}' \"127.0.0.1:19559/flags\" # metad\n
If the log level is changed while NebulaGraph is running, it will be restored to the level set in the configuration file after restarting the service. To permanently modify it, see Configuration files.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#rocksdb_runtime_logs","title":"RocksDB runtime logs","text":"RocksDB runtime logs are usually used to debug RocksDB parameters and stored in /usr/local/nebula/data/storage/nebula/$id/data/LOG
. $id
is the ID of the example.
Glog does not inherently support log recycling. To implement this feature, you can either use cron jobs in Linux to regularly remove old log files or use the log management tool, logrotate, to rotate logs for regular archiving and deletion.
"},{"location":"5.configurations-and-logs/2.log-management/logs/#log_recycling_using_cron_jobs","title":"Log recycling using cron jobs","text":"This section provides an example of how to use cron jobs to regularly delete old log files from the Graph service's runtime logs.
In the Graph service configuration file, apply the following settings and restart the service:
timestamp_in_logfile_name = true\nmax_log_size = 500\n
timestamp_in_logfile_name
to true
, the log file name includes a timestamp, allowing regular deletion of old log files.max_log_size
parameter sets the maximum size of a single log file in MB, such as 500
. Once this size is exceeded, a new log file is automatically created. The default value is 1800
.Use the following command to open the cron job editor.
crontab -e\n
Add a cron job command to the editor to regularly delete old log files.
* * * * * find <log_path> -name \"<YourProjectName>\" -mtime +7 -delete\n
Caution
The find
command in the above command should be executed by the root user or a user with sudo privileges.
* * * * *
: This cron job time field signifies that the task is executed every minute. For other settings, see Cron Expression.<log_path>
: The path of the service runtime log file, such as /usr/local/nebula/logs
.<YourProjectName>
: The log file name, such as nebula-graphd.*
.-mtime +7
: This deletes log files that are older than 7 days. Alternatively, use -mmin +n
to delete log files older than n
minutes. For details, see the find command.-delete
: This deletes log files that meet the conditions.For example, to automatically delete the Graph service runtime log files older than 7 days at 3 o'clock every morning, use:
0 3 * * * find /usr/local/nebula/logs -name nebula-graphd.* -mtime +7 -delete\n
Save the cron job and exit the editor.
Logrotate is a tool that can rotate specified log files for archiving and recycling.
Note
You must be the root user or a user with sudo privileges to install or run logrotate.
This section provides an example of how to use logrotate to manage the Graph service's INFO
level log file (/usr/local/nebula/logs/nebula-graphd.INFO.impl
).
In the Graph service configuration file, set timestamp_in_logfile_name
to false
so that the logrotate tool can recognize the log file name. Then, restart the service.
timestamp_in_logfile_name = false\n
Install logrotate.
For Debian/Ubuntu:
sudo apt-get install logrotate\n
For CentOS/RHEL:
sudo yum install logrotate\n
Create a logrotate configuration file, add log rotation rules, and save the configuration file.
In the /etc/logrotate.d
directory, create a new logrotate configuration file nebula-graphd.INFO
.
sudo vim /etc/logrotate.d/nebula-graphd.INFO\n
Then, add the following content:
# The absolute path of the log file needs to be configured\n# And the file name cannot be a symbolic link file, such as `nebula-graph.INFO`\n/usr/local/nebula/logs/nebula-graphd.INFO.impl {\n daily\n rotate 2\n copytruncate\n nocompress\n missingok\n notifempty\n create 644 root root\n dateext\n dateformat .%Y-%m-%d-%s\n maxsize 1k\n}\n
Parameter Description daily
Rotate the log daily. Other available time units include hourly
, daily
, weekly
, monthly
, and yearly
. rotate 2
Keep the most recent 2 log files before deleting the older one. copytruncate
Copy the current log file and then truncate it, ensuring no disruption to the logging process. nocompress
Do not compress the old log files. missingok
Do not report errors if the log file is missing. notifempty
Do not rotate the log file if it's empty. create 644 root root
Create a new log file with the specified permissions and ownership. dateext
Add a date extension to the log file name. The default is the current date in the format -%Y%m%d
. You can extend this using the dateformat
option. dateformat .%Y-%m-%d-%s
This must follow immediately after dateext
and defines the file name after log rotation. Before V3.9.0, only %Y
, %m
, %d
, and %s
parameters were supported. Starting from V3.9.0, the %H
parameter is also supported. maxsize 1k
Rotate the log when it exceeds 1 kilobyte (1024
bytes) in size or when the specified time unit (e.g., daily
) passes. You can use size units like k
and M
, with the default unit being bytes. Modify the parameters in the configuration file according to actual needs. For more information about parameter configuration, see logrotate.
Test the logrotate configuration.
To verify whether the logrotate configuration is correct, use the following command for testing.
sudo logrotate --debug /etc/logrotate.d/nebula-graphd.INFO\n
Execute logrotate.
Although logrotate
is typically executed automatically by cron jobs, you can manually execute the following command to perform log rotation immediately.
sudo logrotate -fv /etc/logrotate.d/nebula-graphd.INFO\n
-fv
: f
stands for forced execution, v
stands for verbose output.
Verify the log rotation results.
After log rotation, new log files are found in the /usr/local/nebula/logs
directory, such as nebula-graphd.INFO.impl.2024-01-04-1704338204
. The original log content is cleared, but the file is retained for new log entries. When the number of log files exceeds the value set by rotate
, the oldest log file is deleted.
For example, rotate
2` means keeping the 2 most recently generated log files. When the number of log files exceeds 2, the oldest log file is deleted.
[test@test logs]$ ll\n-rw-r--r-- 1 root root 0 Jan 4 11:18 nebula-graphd.INFO.impl \n-rw-r--r-- 1 root root 6894 Jan 4 11:16 nebula-graphd.INFO.impl.2024-01-04-1704338204 # This file is deleted when a new log file is generated\n-rw-r--r-- 1 root root 222 Jan 4 11:18 nebula-graphd.INFO.impl.2024-01-04-1704338287\n[test@test logs]$ ll\n-rw-r--r-- 1 root root 0 Jan 4 11:18 nebula-graphd.INFO.impl\n-rw-r--r-- 1 root root 222 Jan 4 11:18 nebula-graphd.INFO.impl.2024-01-04-1704338287\n-rw-r--r-- 1 root root 222 Jan 4 11:18 nebula-graphd.INFO.impl.2024-01-04-1704338339 # The new log file is generated\n
If you need to rotate multiple log files, create multiple configuration files in the /etc/logrotate.d
directory, with each configuration file corresponding to a log file. For example, to rotate the INFO
level log file and the WARNING
level log file of the Meta service, create two configuration files nebula-metad.INFO
and nebula-metad.WARNING
, and add log rotation rules in them respectively.
NebulaGraph supports querying the monitoring metrics through HTTP ports.
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#metrics_structure","title":"Metrics structure","text":"Each metric of NebulaGraph consists of three fields: name, type, and time range. The fields are separated by periods, for example, num_queries.sum.600
. Different NebulaGraph services (Graph, Storage, or Meta) support different metrics. The detailed description is as follows.
num_queries
Indicates the function of the metric. Metric type sum
Indicates how the metrics are collected. Supported types are SUM, AVG, RATE, and the P-th sample quantiles such as P75, P95, P99, and P999. Time range 600
The time range in seconds for the metric collection. Supported values are 5, 60, 600, and 3600, representing the last 5 seconds, 1 minute, 10 minutes, and 1 hour."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#query_metrics_over_http","title":"Query metrics over HTTP","text":""},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#syntax","title":"Syntax","text":"curl -G \"http://<host>:<port>/stats?stats=<metric_name_list> [&format=json]\"\n
Parameter Description host
The IP (or hostname) of the server. You can find it in the configuration file in the installation directory. port
The HTTP port of the server. You can find it in the configuration file in the installation directory. The default ports are 19559 (Meta), 19669 (Graph), and 19779 (Storage). metric_name_list
The metrics names. Multiple metrics are separated by commas (,). &format=json
Optional. Returns the result in the JSON format. Note
If NebulaGraph is deployed with Docker Compose, run docker-compose ps
to check the ports that are mapped from the service ports inside of the container and then query through them.
Query the query number in the last 10 minutes in the Graph Service.
$ curl -G \"http://192.168.8.40:19669/stats?stats=num_queries.sum.600\"\nnum_queries.sum.600=400\n
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#query_multiple_metrics","title":"Query multiple metrics","text":"Query the following metrics together:
The average latency of the slowest 1% heartbeats, i.e., the P99 heartbeats, in the last 10 minutes.
$ curl -G \"http://192.168.8.40:19559/stats?stats=heartbeat_latency_us.avg.60,heartbeat_latency_us.p99.600\"\nheartbeat_latency_us.avg.60=281\nheartbeat_latency_us.p99.600=985\n
Query the number of new vertices in the Storage Service in the last 10 minutes and return the result in the JSON format.
$ curl -G \"http://192.168.8.40:19779/stats?stats=num_add_vertices.sum.600&format=json\"\n[{\"value\":1,\"name\":\"num_add_vertices.sum.600\"}]\n
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#query_all_metrics_in_a_service","title":"Query all metrics in a service.","text":"If no metric is specified in the query, NebulaGraph returns all metrics in the service.
$ curl -G \"http://192.168.8.40:19559/stats\"\nheartbeat_latency_us.avg.5=304\nheartbeat_latency_us.avg.60=308\nheartbeat_latency_us.avg.600=299\nheartbeat_latency_us.avg.3600=285\nheartbeat_latency_us.p75.5=652\nheartbeat_latency_us.p75.60=669\nheartbeat_latency_us.p75.600=651\nheartbeat_latency_us.p75.3600=642\nheartbeat_latency_us.p95.5=930\nheartbeat_latency_us.p95.60=963\nheartbeat_latency_us.p95.600=933\nheartbeat_latency_us.p95.3600=929\nheartbeat_latency_us.p99.5=986\nheartbeat_latency_us.p99.60=1409\nheartbeat_latency_us.p99.600=989\nheartbeat_latency_us.p99.3600=986\nnum_heartbeats.rate.5=0\nnum_heartbeats.rate.60=0\nnum_heartbeats.rate.600=0\nnum_heartbeats.rate.3600=0\nnum_heartbeats.sum.5=2\nnum_heartbeats.sum.60=40\nnum_heartbeats.sum.600=394\nnum_heartbeats.sum.3600=2364\n...\n
"},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#space-level_metrics","title":"Space-level metrics","text":"The Graph service supports a set of space-level metrics that record the information of different graph spaces separately.
Space-level metrics can be queried only by querying all metrics. For example, run curl -G \"http://192.168.8.40:19559/stats\"
to show all metrics. The returned result contains the graph space name in the form of '{space=space_name}', such as num_active_queries{space=basketballplayer}.sum.5=0
.
Caution
To enable space-level metrics, set the value of enable_space_level_metrics
to true
in the Graph service configuration file before starting NebulaGraph. For details about how to modify the configuration, see Configuration Management.
num_active_queries
The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. num_active_sessions
The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time.For example, when querying num_active_sessions.sum.5
, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is -20
(10-30). num_aggregate_executors
The number of executions for the Aggregation operator. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions_out_of_max_allowed
The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS
was exceeded. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_indexscan_executors
The number of executions for index scan operators. num_killed_queries
The number of killed queries. num_opened_sessions
The number of sessions connected to the server. num_queries
The number of queries. num_query_errors_leader_changes
The number of the raft leader changes due to query errors. num_query_errors
The number of query errors. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. num_sentences
The number of statements received by the Graphd service. num_slow_queries
The number of slow queries. num_sort_executors
The number of executions for the Sort operator. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. slow_query_latency_us
The latency of slow queries. num_queries_hit_memory_watermark
The number of queries that reached the memory watermark. resp_part_completeness
The completeness of the partial success. You need to set accept_partial_success
to true
in the graph configuration first."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#meta","title":"Meta","text":"Parameter Description commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. heartbeat_latency_us
The latency of heartbeats. num_heartbeats
The number of heartbeats. num_raft_votes
The number of votes in Raft. transfer_leader_latency_us
The latency of transferring the raft leader. num_agent_heartbeats
The number of heartbeats for the AgentHBProcessor. agent_heartbeat_latency_us
The latency of the AgentHBProcessor. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_send_snapshot
The number of times that Raft sends snapshots to other nodes. append_log_latency_us
The latency of replicating the log record to a single node by Raft. append_wal_latency_us
The Raft write latency for a single WAL. num_grant_votes
The number of times that Raft votes for other nodes. num_start_elect
The number of times that Raft starts an election."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#storage","title":"Storage","text":"Parameter Description add_edges_latency_us
The latency of adding edges. add_vertices_latency_us
The latency of adding vertices. commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. delete_edges_latency_us
The latency of deleting edges. delete_vertices_latency_us
The latency of deleting vertices. get_neighbors_latency_us
The latency of querying neighbor vertices. get_dst_by_src_latency_us
The latency of querying the destination vertex by the source vertex. num_get_prop
The number of executions for the GetPropProcessor. num_get_neighbors_errors
The number of execution errors for the GetNeighborsProcessor. num_get_dst_by_src_errors
The number of execution errors for the GetDstBySrcProcessor. get_prop_latency_us
The latency of executions for the GetPropProcessor. num_edges_deleted
The number of deleted edges. num_edges_inserted
The number of inserted edges. num_raft_votes
The number of votes in Raft. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Storage service sent to the Meta service. num_rpc_sent_to_metad
The number of RPC requests that the Storaged service sent to the Metad service. num_tags_deleted
The number of deleted tags. num_vertices_deleted
The number of deleted vertices. num_vertices_inserted
The number of inserted vertices. transfer_leader_latency_us
The latency of transferring the raft leader. lookup_latency_us
The latency of executions for the LookupProcessor. num_lookup_errors
The number of execution errors for the LookupProcessor. num_scan_vertex
The number of executions for the ScanVertexProcessor. num_scan_vertex_errors
The number of execution errors for the ScanVertexProcessor. update_edge_latency_us
The latency of executions for the UpdateEdgeProcessor. num_update_vertex
The number of executions for the UpdateVertexProcessor. num_update_vertex_errors
The number of execution errors for the UpdateVertexProcessor. kv_get_latency_us
The latency of executions for the GetProcessor. kv_put_latency_us
The latency of executions for the PutProcessor. kv_remove_latency_us
The latency of executions for the RemoveProcessor. num_kv_get_errors
The number of execution errors for the GetProcessor. num_kv_get
The number of executions for the GetProcessor. num_kv_put_errors
The number of execution errors for the PutProcessor. num_kv_put
The number of executions for the PutProcessor. num_kv_remove_errors
The number of execution errors for the RemoveProcessor. num_kv_remove
The number of executions for the RemoveProcessor. forward_tranx_latency_us
The latency of transmission. scan_edge_latency_us
The latency of executions for the ScanEdgeProcessor. num_scan_edge_errors
The number of execution errors for the ScanEdgeProcessor. num_scan_edge
The number of executions for the ScanEdgeProcessor. scan_vertex_latency_us
The latency of executions for the ScanVertexProcessor. num_add_edges
The number of times that edges are added. num_add_edges_errors
The number of errors when adding edges. num_add_vertices
The number of times that vertices are added. num_start_elect
The number of times that Raft starts an election. num_add_vertices_errors
The number of errors when adding vertices. num_delete_vertices_errors
The number of errors when deleting vertices. append_log_latency_us
The latency of replicating the log record to a single node by Raft. num_grant_votes
The number of times that Raft votes for other nodes. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_delete_tags
The number of times that tags are deleted. num_delete_tags_errors
The number of errors when deleting tags. num_delete_edges
The number of edge deletions. num_delete_edges_errors
The number of errors when deleting edges. num_send_snapshot
The number of times that snapshots are sent. update_vertex_latency_us
The latency of executions for the UpdateVertexProcessor. append_wal_latency_us
The Raft write latency for a single WAL. num_update_edge
The number of executions for the UpdateEdgeProcessor. delete_tags_latency_us
The latency of deleting tags. num_update_edge_errors
The number of execution errors for the UpdateEdgeProcessor. num_get_neighbors
The number of executions for the GetNeighborsProcessor. num_get_dst_by_src
The number of executions for the GetDstBySrcProcessor. num_get_prop_errors
The number of execution errors for the GetPropProcessor. num_delete_vertices
The number of times that vertices are deleted. num_lookup
The number of executions for the LookupProcessor. num_sync_data
The number of times the Storage service synchronizes data from the Drainer. num_sync_data_errors
The number of errors that occur when the Storage service synchronizes data from the Drainer. sync_data_latency_us
The latency of the Storage service synchronizing data from the Drainer."},{"location":"6.monitor-and-metrics/1.query-performance-metrics/#graph_space","title":"Graph space","text":"Note
Space-level metrics are created dynamically: a metric is created, and becomes queryable, only after the corresponding behavior is triggered in the graph space.
Parameter Descriptionnum_active_queries
The number of queries currently being executed. num_queries
The number of queries. num_sentences
The number of statements received by the Graphd service. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. num_slow_queries
The number of slow queries. num_query_errors
The number of query errors. num_query_errors_leader_changes
The number of raft leader changes due to query errors. num_killed_queries
The number of killed queries. num_aggregate_executors
The number of executions for the Aggregation operator. num_sort_executors
The number of executions for the Sort operator. num_indexscan_executors
The number of executions for index scan operators. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_opened_sessions
The number of sessions connected to the server. num_queries_hit_memory_watermark
The number of queries that reached the memory watermark. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. slow_query_latency_us
The latency of slow queries."},{"location":"6.monitor-and-metrics/2.rocksdb-statistics/","title":"RocksDB statistics","text":"NebulaGraph uses RocksDB as the underlying storage. This topic describes how to collect and show the RocksDB statistics of NebulaGraph.
"},{"location":"6.monitor-and-metrics/2.rocksdb-statistics/#enable_rocksdb","title":"Enable RocksDB","text":"By default, the function of RocksDB statistics is disabled. To enable RocksDB statistics, you need to:
Set the --enable_rocksdb_statistics
parameter to true
in the nebula-storaged.conf
file. The default path of the configuration file is /usr/local/nebula/etc
.
Restart the service to make the modification valid.
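A minimal sketch of the two steps, assuming a default installation (adjust the paths to your deployment):

# Step 1: add the flag in /usr/local/nebula/etc/nebula-storaged.conf
--enable_rocksdb_statistics=true
# Step 2: restart the Storage service so the flag takes effect
$ sudo /usr/local/nebula/scripts/nebula.service restart storaged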
Users can use the built-in HTTP service in the storage service to get the following types of statistics. Results in the JSON format are supported.
Use the following command to get all RocksDB statistics:
curl -L \"http://${storage_ip}:${port}/rocksdb_stats\"\n
For example:
curl -L \"http://172.28.2.1:19779/rocksdb_stats\"\n\nrocksdb.blobdb.blob.file.bytes.read=0\nrocksdb.blobdb.blob.file.bytes.written=0\nrocksdb.blobdb.blob.file.bytes.synced=0\n...\n
Use the following command to get specified RocksDB statistics:
curl -L \"http://${storage_ip}:${port}/rocksdb_stats?stats=${stats_name}\"\n
For example, use the following command to get the information of rocksdb.bytes.read
and rocksdb.block.cache.add
.
curl -L \"http://172.28.2.1:19779/rocksdb_stats?stats=rocksdb.bytes.read,rocksdb.block.cache.add\"\n\nrocksdb.block.cache.add=14\nrocksdb.bytes.read=1632\n
Use the following command to get specified RocksDB statistics in the JSON format:
curl -L \"http://${storage_ip}:${port}/rocksdb_stats?stats=${stats_name}&format=json\"\n
For example, use the following command to get the information of rocksdb.bytes.read
and rocksdb.block.cache.add
and return the results in the JSON format.
curl -L \"http://172.28.2.1:19779/rocksdb_stats?stats=rocksdb.bytes.read,rocksdb.block.cache.add&format=json\"\n\n[\n {\n \"rocksdb.block.cache.add\": 1\n },\n {\n \"rocksdb.bytes.read\": 160\n }\n]\n
"},{"location":"7.data-security/4.ssl/","title":"SSL encryption","text":"NebulaGraph supports SSL encrypted transfers between the Client, Graph Service, Meta Service, and Storage Service, and this topic describes how to set up SSL encryption.
"},{"location":"7.data-security/4.ssl/#precaution","title":"Precaution","text":"Enabling SSL encryption will slightly affect the performance, such as causing operation latency.
"},{"location":"7.data-security/4.ssl/#certificate_modes","title":"Certificate modes","text":"To use SSL encryption, SSL certificates are required. NebulaGraph supports two certificate modes.
Self-signed certificate mode
A certificate that is generated by the server itself and signed by itself. In the self-signed certificate mode, the server needs to generate its own SSL certificate and key, and then use its own private key to sign the certificate. It is suitable for building secure communications for systems and applications within a LAN.
CA-signed certificate mode
A certificate granted by a trusted third-party Certificate Authority (CA). In the CA signed certificate mode, the server needs to apply for an SSL certificate from a trusted CA and ensure the authenticity and trustworthiness of the certificate through the auditing and signing of the certificate authority center. It is suitable for public network environment, especially for websites, e-commerce and other occasions that need to protect user information security.
Policies for the NebulaGraph community edition.
Scene TLS External device access to Graph Modify the Graph configuration file to add the following parameters:--enable_graph_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
Graph access Meta In the Graph/Meta configuration file, add the following parameters:--enable_meta_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
Graph access StorageMeta access Storage In the Graph/Meta/Storage configuration file, add the following parameters:--enable_storage_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
Graph access Meta/StorageMeta access Storage In the Graph/Meta/Storage configuration file, add the following parameters:--enable_meta_ssl = true
--enable_storage_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
External device access to GraphGraph access Meta/StorageMeta access Storage In the Graph/Meta/Storage configuration file, add the following parameters:--enable_ssl = true
--ca_path=xxxxxx
--cert_path=xxxxxx
--key_path=xxxxxx
The parameters are described below.
Parameter Default value Descriptioncert_path
- The path to the SSL public key certificate. This certificate is usually a .pem
or .crt
file, which is used to prove the identity of the server side, and contains information such as the public key, certificate owner, digital signature, and so on. key_path
- The path to the SSL key. The SSL key is usually a .key
file. password_path
- (Optional) The path to the password file for the SSL key. Some SSL keys are encrypted and require a corresponding password to decrypt. We need to store the password in a separate file and use this parameter to specify the path to the password file. ca_path
- The path to the SSL root certificate. The root certificate is a special SSL certificate that is considered the highest level in the SSL trust chain and is used to validate and authorize other SSL certificates. enable_ssl
false
Whether to enable SSL encryption in all services. enable_graph_ssl
false
Whether to enable SSL encryption in the Graph service only. enable_meta_ssl
false
Whether to enable SSL encryption in the Meta service only. enable_storage_ssl
false
Whether to enable SSL encryption in the Storage service only."},{"location":"7.data-security/4.ssl/#example_of_tls","title":"Example of TLS","text":"For example, using self-signed certificates and TLS for data transfers between the client NebulaGraph Python, the Graph service, the Meta service, and the Storage service. You need to set up all three Graph/Meta/Storage configuration files as follows:
--enable_ssl=true\n--ca_path=xxxxxx\n--cert_path=xxxxxx\n--key_path=xxxxxx\n
When the changes are complete, restart these services to make the configuration take effect.
To connect to the Graph service using NebulaGraph Python, you need to set up a secure socket and add a trusted CA. For code examples, see nebula-test-run.py.
NebulaGraph replies on local authentication to implement access control.
NebulaGraph creates a session when a client connects to it. The session stores information about the connection, including the user information. If the authentication system is enabled, the session will be mapped to corresponding users.
Note
By default, the authentication is disabled and NebulaGraph allows connections with the username root
and any password.
Local authentication indicates that usernames and passwords are stored locally on the server, with the passwords encrypted. Users will be authenticated when trying to visit NebulaGraph.
"},{"location":"7.data-security/1.authentication/1.authentication/#enable_local_authentication","title":"Enable local authentication","text":"Modify the nebula-graphd.conf
file (/usr/local/nebula/etc/
is the default path) to set the following parameters:
--enable_authorize
: Set its value to true
to enable authentication.
Note
root
and any password.root
and password nebula
to log into NebulaGraph after enabling local authentication. This account has the build-in God role. For more information about roles, see Roles and privileges.--failed_login_attempts
: This parameter is optional, and you need to add this parameter manually. Specify the attempts of continuously entering incorrect passwords for a single Graph service. When the number exceeds the limitation, your account will be locked. For multiple Graph services, the allowed attempts are number of services * failed_login_attempts
.--password_lock_time_in_secs
: This parameter is optional, and you need to add this parameter manually. Specify the time how long your account is locked after multiple incorrect password entries are entered. Unit: second.Restart the NebulaGraph services. For how to restart, see Manage NebulaGraph services.
User management is an indispensable part of NebulaGraph access control. This topic describes how to manage users and roles.
After enabling authentication, only valid users can connect to NebulaGraph and access the resources according to the user roles.
Note
root
and any password.The root
user with the GOD role can run CREATE USER
to create a new user.
Syntax
CREATE USER [IF NOT EXISTS] <user_name> [WITH PASSWORD '<password>'];\n
IF NOT EXISTS
: Detects if the user name exists. The user will be created only if the user name does not exist.user_name
: Sets the name of the user. The maximum length is 16 characters.password
: Sets the password of the user. The default password is the empty string (''
). The maximum length is 24 characters.Example
nebula> CREATE USER user1 WITH PASSWORD 'nebula';\nnebula> SHOW USERS;\n+---------+-------------------------------+\n| Account | IP Whitelist |\n+---------+-------------------------------+\n| \"root\" | \"\" |\n| \"user1\" | \"\" |\n+---------+-------------------------------+\n
Users with the GOD role or the ADMIN role can run GRANT ROLE
to assign a built-in role in a graph space to a user. For more information about NebulaGraph built-in roles, see Roles and privileges.
Syntax
GRANT ROLE <role_type> ON <space_name> TO <user_name>;\n
Example
nebula> GRANT ROLE USER ON basketballplayer TO user1;\n
Users with the GOD role or the ADMIN role can run REVOKE ROLE
to revoke the built-in role of a user in a graph space. For more information about NebulaGraph built-in roles, see Roles and privileges.
Syntax
REVOKE ROLE <role_type> ON <space_name> FROM <user_name>;\n
Example
nebula> REVOKE ROLE USER ON basketballplayer FROM user1;\n
Users can run DESCRIBE USER
to list the roles for a specified user.
Syntax
DESCRIBE USER <user_name>;\nDESC USER <user_name>;\n
Example
nebula> DESCRIBE USER user1;\n+---------+--------------------+\n| role | space |\n+---------+--------------------+\n| \"ADMIN\" | \"basketballplayer\" |\n+---------+--------------------+\n
Users can run SHOW ROLES
to list the roles in a graph space.
Syntax
SHOW ROLES IN <space_name>;\n
Example
nebula> SHOW ROLES IN basketballplayer;\n+---------+-----------+\n| Account | Role Type |\n+---------+-----------+\n| \"user1\" | \"ADMIN\" |\n+---------+-----------+\n
Users can run CHANGE PASSWORD
to set a new password for a user. The old password is needed when setting a new one.
Syntax
CHANGE PASSWORD <user_name> FROM '<old_password>' TO '<new_password>';\n
Example
nebula> CHANGE PASSWORD user1 FROM 'nebula' TO 'nebula123';\n
The root
user with the GOD role can run ALTER USER
to set a new password. The old password is not needed when altering the user.
Syntax
ALTER USER <user_name> WITH PASSWORD '<password>';\n
- Example nebula> ALTER USER user2 WITH PASSWORD 'nebula';\n
The root
user with the GOD role can run DROP USER
to remove a user.
Note
Removing a user does not close the current session of the user, and the user role still takes effect in the session until the session is closed.
Syntax
DROP USER [IF EXISTS] <user_name>;\n
Example
nebula> DROP USER user1;\n
The root
user with the GOD role can run SHOW USERS
to list all the users.
Syntax
SHOW USERS;\n
Example
nebula> SHOW USERS;\n+---------+-----------------+\n| Account | IP Whitelist |\n+---------+-----------------+\n| \"root\" | \"\" |\n| \"user1\" | \"\" |\n| \"user2\" | \"192.168.10.10\" |\n+---------+-----------------+\n
A role is a collection of privileges. You can assign a role to a user for access control.
"},{"location":"7.data-security/1.authentication/3.role-list/#built-in_roles","title":"Built-in roles","text":"NebulaGraph does not support custom roles, but it has multiple built-in roles:
GOD
root
in Linux and administrator
in Windows.root
is automatically created with the password nebula
.Caution
Modify the password for root
timely for security.
When the --enable_authorize
parameter in the nebula-graphd.conf
file (the default directory is /usr/local/nebula/etc/
) is set to true
:
root
user with the default God role can be used.ADMIN
An ADMIN role of a graph space can grant DBA, USER, and GUEST roles in the graph space to other users.
Note
Only roles lower than ADMIN can be authorized to other users.
DBA
USER
Note
The privileges of roles and the nGQL statements that each role can use are listed as follows.
Privilege God Admin DBA User Guest Allowed nGQL Read space Y Y Y Y YUSE
, DESCRIBE SPACE
Read schema Y Y Y Y Y DESCRIBE TAG
, DESCRIBE EDGE
, DESCRIBE TAG INDEX
, DESCRIBE EDGE INDEX
Write schema Y Y Y Y CREATE TAG
, ALTER TAG
, CREATE EDGE
, ALTER EDGE
, DROP TAG
, DELETE TAG
, DROP EDGE
, CREATE TAG INDEX
, CREATE EDGE INDEX
, DROP TAG INDEX
, DROP EDGE INDEX
Write user Y CREATE USER
, DROP USER
, ALTER USER
Write role Y Y GRANT
, REVOKE
Read data Y Y Y Y Y GO
, SET
, PIPE
, MATCH
, ASSIGNMENT
, LOOKUP
, YIELD
, ORDER BY
, FETCH VERTICES
, Find
, FETCH EDGES
, FIND PATH
, LIMIT
, GROUP BY
, RETURN
Write data Y Y Y Y INSERT VERTEX
, UPDATE VERTEX
, INSERT EDGE
, UPDATE EDGE
, DELETE VERTEX
, DELETE EDGES
, DELETE TAG
Show operations Y Y Y Y Y SHOW
, CHANGE PASSWORD
Job Y Y Y Y SUBMIT JOB COMPACT
, SUBMIT JOB FLUSH
, SUBMIT JOB STATS
, STOP JOB
, RECOVER JOB
, BUILD TAG INDEX
, BUILD EDGE INDEX
,INGEST
, DOWNLOAD
Write space Y CREATE SPACE
, DROP SPACE
, CREATE SNAPSHOT
, DROP SNAPSHOT
, BALANCE
, CONFIG
Caution
SHOW
operations are limited to the role of a user. For example, all users can run SHOW SPACES
, but the results only include the graph spaces that the users have privileges on. Only the GOD role can run SHOW USERS
and SHOW SNAPSHOTS
.This topic provides general suggestions for modeling data in NebulaGraph.
Note
The following suggestions may not apply to some special scenarios. In these cases, find help in the NebulaGraph community.
"},{"location":"8.service-tuning/2.graph-modeling/#model_for_performance","title":"Model for performance","text":"There is no perfect method to model in Nebula\u00a0Graph. Graph modeling depends on the questions that you want to know from the data. Your data drives your graph model. Graph data modeling is intuitive and convenient. Create your data model based on your business model. Test your model and gradually optimize it to fit your business. To get better performance, you can change or re-design your model multiple times.
"},{"location":"8.service-tuning/2.graph-modeling/#design_and_evaluate_the_most_important_queries","title":"Design and evaluate the most important queries","text":"Usually, various types of queries are validated in test scenarios to assess the overall capabilities of the system. However, in most production scenarios, there are not many types of frequently used queries. You can optimize the data model based on key queries selected according to the Pareto (80/20) principle.
"},{"location":"8.service-tuning/2.graph-modeling/#full-graph_scanning_avoidance","title":"Full-graph scanning avoidance","text":"Graph traversal can be performed after one or more vertices/edges are located through property indexes or VIDs. But for some query patterns, such as subgraph and path query patterns, the source vertex or edge of the traversal cannot be located through property indexes or VIDs. These queries find all the subgraphs that satisfy the query pattern by scanning the whole graph space which will have poor query performance. NebulaGraph does not implement indexing for the graph structures of subgraphs or paths.
"},{"location":"8.service-tuning/2.graph-modeling/#no_predefined_bonds_between_tags_and_edge_types","title":"No predefined bonds between Tags and Edge types","text":"Define the bonds between Tags and Edge types in the application, not NebulaGraph. There are no statements that could get the bonds between Tags and Edge types.
"},{"location":"8.service-tuning/2.graph-modeling/#tagsedge_types_predefine_a_set_of_properties","title":"Tags/Edge types predefine a set of properties","text":"While creating Tags or Edge types, you need to define a set of properties. Properties are part of the NebulaGraph Schema.
"},{"location":"8.service-tuning/2.graph-modeling/#control_changes_in_the_business_model_and_the_data_model","title":"Control changes in the business model and the data model","text":"Changes here refer to changes in business models and data models (meta-information), not changes in the data itself.
Some graph databases are designed to be Schema-free, so their data modeling, including the modeling of the graph topology and properties, can be very flexible. Properties can be re-modeled to graph topology, and vice versa. Such systems are often specifically optimized for graph topology access.
NebulaGraph master is a strong-Schema (row storage) system, which means that the business model should not change frequently. For example, the property Schema should not change. It is similar to avoiding ALTER TABLE
in MySQL.
On the contrary, vertices and their edges can be added or deleted at low costs. Thus, the easy-to-change part of the business model should be transformed to vertices or edges, rather than properties.
For example, in a business model, people have relatively fixed properties such as age, gender, and name. But their contact, place of visit, trade account, and login device are often changing. The former is suitable for modeling as properties and the latter as vertices or edges.
"},{"location":"8.service-tuning/2.graph-modeling/#set_temporary_properties_through_self-loop_edges","title":"Set temporary properties through self-loop edges","text":"As a strong Schema system, NebulaGraph does not support List-type properties. And using ALTER TAG
costs too much. If you need to add some temporary properties or List-type properties to a vertex, you can first create an edge type with the required properties, and then insert one or more edges that direct to the vertex itself. The figure is as follows.
To retrieve temporary properties of vertices, fetch from self-loop edges. For example:
//Create the edge type and insert the loop property.\nnebula> CREATE EDGE IF NOT EXISTS temp(tmp int);\nnebula> INSERT EDGE temp(tmp) VALUES \"player100\"->\"player100\"@1:(1);\nnebula> INSERT EDGE temp(tmp) VALUES \"player100\"->\"player100\"@2:(2);\nnebula> INSERT EDGE temp(tmp) VALUES \"player100\"->\"player100\"@3:(3);\n\n//After the data is inserted, you can query the loop property by general query statements, for example:\nnebula> GO FROM \"player100\" OVER temp YIELD properties(edge).tmp;\n+----------------------+\n| properties(EDGE).tmp |\n+----------------------+\n| 1 |\n| 2 |\n| 3 |\n+----------------------+\n\n//If you want the results to be returned in the form of a List, you can use a function, for example:\nnebula> MATCH (v1:player)-[e:temp]->() return collect(e.tmp);\n+----------------+\n| collect(e.tmp) |\n+----------------+\n| [1, 2, 3] |\n+----------------+\n
Operations on loops are not encapsulated with any syntactic sugars and you can use them just like those on normal edges."},{"location":"8.service-tuning/2.graph-modeling/#about_dangling_edges","title":"About dangling edges","text":"A dangling edge is an edge that only connects to a single vertex and only one part of the edge connects to the vertex.
In NebulaGraph master, dangling edges may appear in the following two cases.
Insert edges with INSERT EDGE statement before the source vertex or the destination vertex exists.
Delete vertices with DELETE VERTEX statement and the WITH EDGE
option is not used. At this time, the system does not delete the related outgoing and incoming edges of the vertices. There will be dangling edges by default.
Dangling edges may appear in NebulaGraph master as the design allow it to exist. And there is no MERGE statement like openCypher has. The existence of dangling edges depends entirely on the application level. You can use GO and LOOKUP statements to find a dangling edge, but cannot use the MATCH statement to find a dangling edge.
Examples:
// Insert an edge that connects two vertices which do not exist in the graph. The source vertex's ID is '11'. The destination vertex's ID is'13'. \n\nnebula> CREATE EDGE IF NOT EXISTS e1 (name string, age int);\nnebula> INSERT EDGE e1 (name, age) VALUES \"11\"->\"13\":(\"n1\", 1);\n\n// Query using the `GO` statement\n\nnebula> GO FROM \"11\" over e1 YIELD properties(edge);\n+----------------------+\n| properties(EDGE) |\n+----------------------+\n| {age: 1, name: \"n1\"} |\n+----------------------+\n\n// Query using the `LOOKUP` statement\n\nnebula> LOOKUP ON e1 YIELD EDGE AS r;\n+-------------------------------------------------------+\n| r |\n+-------------------------------------------------------+\n| [:e2 \"11\"->\"13\" @0 {age: 1, name: \"n1\"}] |\n+-------------------------------------------------------+\n\n// Query using the `MATCH` statement\n\nnebula> MATCH ()-[e:e1]->() RETURN e;\n+---+\n| e |\n+---+\n+---+\nEmpty set (time spent 3153/3573 us)\n
"},{"location":"8.service-tuning/2.graph-modeling/#breadth-first_traversal_over_depth-first_traversal","title":"Breadth-first traversal over depth-first traversal","text":"person
and add properties name
, age
, and eye_color
to it. If you create a tag eye_color
and an edge type has
, and then create an edge to represent the eye color owned by the person, the traversal performance will not be high.(src)-[edge {P1, P2}]->(dst)
as (src)-[edge1]->(i_node {P1, P2})-[edge2]->(dst)
. With NebulaGraph master, you can use (src)-[edge {P1, P2}]->(dst)
directly to decrease the depth of the traversal and increase the performance.To query in the opposite direction of an edge, use the following syntax:
(dst)<-[edge]-(src)
or GO FROM dst REVERSELY
.
If you do not care about the directions or want to query against both directions, use the following syntax:
(src)-[edge]-(dst)
or GO FROM src BIDIRECT
.
Therefore, there is no need to insert the same edge redundantly in the reversed direction.
"},{"location":"8.service-tuning/2.graph-modeling/#set_tag_properties_appropriately","title":"Set tag properties appropriately","text":"Put a group of properties that are on the same level into the same tag. Different groups represent different concepts.
"},{"location":"8.service-tuning/2.graph-modeling/#use_indexes_correctly","title":"Use indexes correctly","text":"Using property indexes helps find VIDs through properties, but can lead to great performance reduction. Only use an index when you need to find vertices or edges through their properties.
"},{"location":"8.service-tuning/2.graph-modeling/#design_vids_appropriately","title":"Design VIDs appropriately","text":"See VID.
"},{"location":"8.service-tuning/2.graph-modeling/#long_texts","title":"Long texts","text":"Do not use long texts to create edge properties. Edge properties are stored twice and long texts lead to greater write amplification. For how edges properties are stored, see Storage architecture. It is recommended to store long texts in HBase or Elasticsearch and store its address in NebulaGraph.
"},{"location":"8.service-tuning/2.graph-modeling/#dynamic_graphs_sequence_graphs_are_not_supported","title":"Dynamic graphs (sequence graphs) are not supported","text":"In some scenarios, graphs need to have the time information to describe how the structure of the entire graph changes over time.1
The Rank field on Edges in NebulaGraph master can be used to store time in int64, but no field on vertices can do this because if you store the time information as property values, it will be covered by new insertion. Thus NebulaGraph does not support sequence graphs.
"},{"location":"8.service-tuning/2.graph-modeling/#free_graph_data_modeling_tools","title":"Free graph data modeling tools","text":"arrows.app
https://blog.twitter.com/engineering/en_us/topics/insights/2021/temporal-graph-networks\u00a0\u21a9
INSERT
.COMPACTION
and BALANCE
jobs to optimize data format and storage distribution at the right time.Preheat on the application side:
NebulaGraph master applies rule-based execution plans. Users cannot change execution plans, pre-compile queries (and corresponding plan cache), or accelerate queries by specifying indexes.
To view the execution plan and executive summary, see EXPLAIN and PROFILE.
"},{"location":"8.service-tuning/compaction/","title":"Compaction","text":"This topic gives some information about compaction.
In NebulaGraph, Compaction
is the most important background process and has an important effect on performance.
Compaction
reads the data that is written on the hard disk, then re-organizes the data structure and the indexes, and then writes back to the hard disk. The read performance can increase by times after compaction. Thus, to get high read performance, trigger compaction
(full compaction
) manually when writing a large amount of data into Nebula\u00a0Graph.
Note
Note that compaction
leads to long-time hard disk IO. We suggest that users do compaction during off-peak hours (for example, early morning).
NebulaGraph has two types of compaction
: automatic compaction
and full compaction
.
compaction
","text":"Automatic compaction
is automatically triggered when the system reads data, writes data, or the system restarts. The read performance can increase in a short time. Automatic compaction
is enabled by default. But once triggered during peak hours, it can cause unexpected IO occupancy that has an unwanted effect on the performance.
compaction
","text":"Full compaction
enables large-scale background operations for a graph space such as merging files, deleting the data expired by TTL. This operation needs to be initiated manually. Use the following statements to enable full compaction
:
Note
We recommend you to do the full compaction during off-peak hours because full compaction has a lot of IO operations.
nebula> USE <your_graph_space>;\nnebula> SUBMIT JOB COMPACT;\n
The preceding statement returns the job ID. To show the compaction
progress, use the following statement:
nebula> SHOW JOB <job_id>;\n
"},{"location":"8.service-tuning/compaction/#operation_suggestions","title":"Operation suggestions","text":"These are some operation suggestions to keep Nebula\u00a0Graph performing well.
SUBMIT JOB COMPACT
.SUBMIT JOB COMPACT
periodically during off-peak hours (e.g. early morning).To control the write traffic limitation for compactions
, set the following parameter in the nebula-storaged.conf
configuration file.
Note
This parameter limits the rate of all writes including normal writes and compaction writes.
# Limit the write rate to 20MB/s.\n--rocksdb_rate_limit=20 (in MB/s)\n
Compaction
stored?\"","text":"By default, the logs are stored under the LOG
file in the /usr/local/nebula/data/storage/nebula/{1}/data/
directory, or similar to LOG.old.1625797988509303
. You can find the following content.
** Compaction Stats [default] **\nLevel Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop\n----------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n L0 2/0 2.46 KB 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.53 0.51 2 0.264 0 0\n Sum 2/0 2.46 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.53 0.51 2 0.264 0 0\n Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0\n
If the number of L0
files is large, the read performance will be greatly affected and compaction can be triggered.
compactions
for multiple graph spaces at the same time?\"","text":"Yes, you can. But the IO is much larger at this time and the efficiency may be affected.
"},{"location":"8.service-tuning/compaction/#how_much_time_does_it_take_for_full_compactions","title":"\"How much time does it take for fullcompactions
?\"","text":"When rocksdb_rate_limit
is set to 20
, you can estimate the full compaction time by dividing the hard disk usage by the rocksdb_rate_limit
. If you do not set the rocksdb_rate_limit
value, the empirical value is around 50 MB/s.
--rocksdb_rate_limit
dynamically?\"","text":"No, you cannot.
"},{"location":"8.service-tuning/compaction/#can_i_stop_a_full_compaction_after_it_starts","title":"\"Can I stop a fullcompaction
after it starts?\"","text":"No, you cannot. When you start a full compaction, you have to wait till it is done. This is the limitation of RocksDB.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/","title":"Enable AutoFDO for NebulaGraph","text":"The AutoFDO can analyze the performance of an optimized program and use the program's performance information to guide the compiler to re-optimize the program. This document will help you to enable the AutoFDO for NebulaGraph.
More information about the AutoFDO, please refer AutoFDO Wiki.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#resource_preparations","title":"Resource Preparations","text":""},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#install_dependencies","title":"Install Dependencies","text":"Install perf
sudo apt-get update\nsudo apt-get install -y linux-tools-common \\\nlinux-tools-generic \\\nlinux-tools-`uname -r`\n
Install autofdo tool
sudo apt-get update\nsudo apt-get install -y autofdo\n
Or you can compile the autofdo tool from source.
For how to build NebulaGraph from source, please refer to the official document: Install NebulaGraph by compiling the source code. In the configure step, replace CMAKE_BUILD_TYPE=Release
with CMAKE_BUILD_TYPE=RelWithDebInfo
as below:
$ cmake -DCMAKE_INSTALL_PREFIX=/usr/local/nebula -DENABLE_TESTING=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo ..\n
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#prepare_test_data","title":"Prepare Test Data","text":"In our test environment, we use NebulaGraph Bench to prepare the test data and collect the profile data by running the FindShortestPath, Go1Step, Go2Step, Go3Step, InsertPersonScenario 5 scenarios.
Note
You can use your TopN queries in your production environment to collect the profile data, the performance can gain more in your environment.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#prepare_profile_data","title":"Prepare Profile Data","text":""},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#collect_perf_data_for_autofdo_tool","title":"Collect Perf Data For AutoFdo Tool","text":"After the test data preparation work done. Collect the perf data for different scenarios. Get the pid of storaged
, graphd
, metad
.
$ nebula.service status all\n[INFO] nebula-metad: Running as 305422, Listening on 9559\n[INFO] nebula-graphd: Running as 305516, Listening on 9669\n[INFO] nebula-storaged: Running as 305707, Listening on 9779\n
Start the perf record for nebula-graphd and nebula-storaged.
perf record -p 305516,305707 -b -e br_inst_retired.near_taken:pp -o ~/FindShortestPath.data\n
Note
Because the nebula-metad
service contribution percent is small compared with nebula-graphd
and nebula-storaged
services. To reduce effort, we didn't collect the perf data for nebula-metad
service.
Start the benchmark test for FindShortestPath scenario.
cd NebulaGraph-Bench \npython3 run.py stress run -s benchmark -scenario find_path.FindShortestPath -a localhost:9669 --args='-u 100 -i 100000'\n
After the benchmark finished, end the perf record by Ctrl + c.
Repeat above steps to collect corresponding profile data for the rest Go1Step, Go2Step, Go3Step and InsertPersonScenario scenarios.
create_gcov --binary=$NEBULA_HOME/bin/nebula-storaged \\\n--profile=~/FindShortestPath.data \\\n--gcov=~/FindShortestPath-storaged.gcov \\\n-gcov_version=1\n\ncreate_gcov --binary=$NEBULA_HOME/bin/nebula-graphd \\\n--profile=~/FindShortestPath.data \\\n--gcov=~/FindShortestPath-graphd.gcov \\\n-gcov_version=1\n
Repeat for Go1Step, Go2Step, Go3Step and InsertPersonScenario scenarios.
"},{"location":"8.service-tuning/enable_autofdo_for_nebulagraph/#merge_the_profile_data","title":"Merge the Profile Data","text":"profile_merger ~/FindShortestPath-graphd.gcov \\\n~/FindShortestPath-storaged.gcov \\\n~/go1step-storaged.gcov \\\n~/go1step-graphd.gcov \\\n~/go2step-storaged.gcov \\\n~/go2step-graphd.gcov \\\n~/go3step-storaged.gcov \\\n~/go3step-master-graphd.gcov \\\n~/InsertPersonScenario-storaged.gcov \\\n~/InsertPersonScenario-graphd.gcov\n
You will get a merged profile which is named fbdata.afdo
after that.
Recompile the GraphNebula Binary by passing the profile with compile option -fauto-profile
.
diff --git a/cmake/nebula/GeneralCompilerConfig.cmake b/cmake/nebula/GeneralCompilerConfig.cmake\n@@ -20,6 +20,8 @@ add_compile_options(-Wshadow)\n add_compile_options(-Wnon-virtual-dtor)\n add_compile_options(-Woverloaded-virtual)\n add_compile_options(-Wignored-qualifiers)\n+add_compile_options(-fauto-profile=~/fbdata.afdo)\n
Note
When you use multiple fbdata.afdo to compile multiple times, please remember to make clean
before re-compile, baucase only change the fbdata.afdo will not trigger re-compile.
You can use the SUBMIT JOB BALANCE
statement to balance the distribution of partitions and Raft leaders, or clear some Storage servers for easy maintenance. For details, see SUBMIT JOB BALANCE.
Danger
The BALANCE
commands migrate data and balance the distribution of partitions by creating and executing a set of subtasks. DO NOT stop any machine in the cluster or change its IP address until all the subtasks finish. Otherwise, the follow-up subtasks fail.
To balance the raft leaders, run SUBMIT JOB BALANCE LEADER
. It will start a job to balance the distribution of all the storage leaders in all graph spaces.
nebula> SUBMIT JOB BALANCE LEADER;\n
Run SHOW HOSTS
to check the balance result.
nebula> SHOW HOSTS;\n+------------------+------+----------+--------------+-----------------------------------+------------------------+----------------------+\n| Host | Port | Status | Leader count | Leader distribution | Partition distribution | Version |\n+------------------+------+----------+--------------+-----------------------------------+------------------------+----------------------+\n| \"192.168.10.101\" | 9779 | \"ONLINE\" | 8 | \"basketballplayer:3\" | \"basketballplayer:8\" | \"master\" |\n| \"192.168.10.102\" | 9779 | \"ONLINE\" | 3 | \"basketballplayer:3\" | \"basketballplayer:8\" | \"master\" |\n| \"192.168.10.103\" | 9779 | \"ONLINE\" | 0 | \"basketballplayer:2\" | \"basketballplayer:7\" | \"master\" |\n| \"192.168.10.104\" | 9779 | \"ONLINE\" | 0 | \"basketballplayer:2\" | \"basketballplayer:7\" | \"master\" |\n| \"192.168.10.105\" | 9779 | \"ONLINE\" | 0 | \"basketballplayer:2\" | \"basketballplayer:7\" | \"master\" |\n+------------------+------+----------+--------------+-----------------------------------+------------------------+----------------------+\n
Caution
During leader partition replica switching in NebulaGraph, the leader replicas will be temporarily prohibited from being written to until the switch is completed. If there are a large number of write requests during the switching period, it will result in a request error (Storage Error E_RPC_FAILURE
). See FAQ for error handling methods.
You can set the value of raft_heartbeat_interval_secs
in the Storage configuration file to control the timeout period for leader replica switching. For more information on the configuration file, see Storage configuration file.
NebulaGraph is used in a variety of industries. This topic presents a few best practices for using NebulaGraph. For more best practices, see Blog.
"},{"location":"8.service-tuning/practice/#scenarios","title":"Scenarios","text":"In graph theory, a super vertex, also known as a dense vertex, is a vertex with an extremely high number of adjacent edges. The edges can be outgoing or incoming.
Super vertices are very common because of the power-law distribution. For example, popular leaders in social networks (Internet celebrities), top stocks in the stock market, Big Four in the banking system, hubs in transportation networks, websites with high clicking rates on the Internet, and best sellers in E-commerce.
In NebulaGraph master, a vertex
and its properties
form a key-value pair
, with its VID
and other meta information as the key
. Its Out-Edge Key-Value
and In-Edge Key-Value
are stored in the same partition in the form of LSM-trees in hard disks and caches.
Therefore, directed traversals from this vertex
and directed traversals ending at this vertex
both involve either a large number of sequential IO scans
(ideally, after Compaction or a large number of random IO
(frequent writes to the vertex
and its ingoing and outgoing edges
).
As a rule of thumb, a vertex is considered dense when the number of its edges exceeds 10,000. Some special cases require additional consideration.
Note
In NebulaGraph master, there is not any data structure to store the out/in degree for each vertex. Therefore, there is no direct method to know whether it is a super vertex or not. You can try to use Spark to count the degrees periodically.
"},{"location":"8.service-tuning/super-node/#indexes_for_duplicate_properties","title":"Indexes for duplicate properties","text":"In a property graph, there is another class of cases similar to super vertices: a property has a very high duplication rate, i.e., many vertices with the same tag
but different VIDs
have identical property and property values.
Property indexes in NebulaGraph master are designed to reuse the functionality of RocksDB in the Storage Service, in which case indexes are modeled as keys with the same prefix
. If the lookup of a property fails to hit the cache, it is processed as a random seek and a sequential prefix scan on the hard disk to find the corresponding VID. After that, the graph is usually traversed from this vertex, so that another random read and sequential scan for the corresponding key-value of this vertex will be triggered. The higher the duplication rate, the larger the scan range.
For more information about property indexes, see How indexing works in NebulaGraph.
Usually, special design and processing are required when the number of duplicate property values exceeds 10,000.
"},{"location":"8.service-tuning/super-node/#suggested_solutions","title":"Suggested solutions","text":""},{"location":"8.service-tuning/super-node/#solutions_at_the_database_end","title":"Solutions at the database end","text":"Break up some of the super vertices according to their business significance:
Delete multiple edges and merge them into one.
For example, in the transfer scenario (Account_A)-[TRANSFER]->(Account_B)
, each transfer record is modeled as an edge between account A and account B, then there may be tens of thousands of transfer records between (Account_A)
and (Account_B)
.
In such scenarios, merge obsolete transfer details on a daily, weekly, or monthly basis. That is, batch-delete old edges and replace them with a small number of edges representing monthly total
and times
. And keep the transfer details of the latest month.
Split an edge into multiple edges of different types.
For example, in the (Airport)<-[DEPART]-(Flight)
scenario, the departure of each flight is modeled as an edge between a flight and an airport. Departures from a big airport might be enormous.
According to different airlines, divide the DEPART
edge type into finer edge types, such as DEPART_CEAIR
, DEPART_CSAIR
, etc. Specify the departing airline in queries (graph traversal).
Split vertices.
For example, in the loan network (person)-[BORROW]->(bank)
, large bank A will have a very large number of loans and borrowers.
In such scenarios, you can split the large vertex A into connected sub-vertices A1, A2, and A3.
(Person1)-[BORROW]->(BankA1), (Person2)-[BORROW]->(BankA2), (Person2)-[BORROW]->(BankA3);\n(BankA1)-[BELONGS_TO]->(BankA), (BankA2)-[BELONGS_TO]->(BankA), (BankA3)-[BELONGS_TO]->(BankA).\n
A1, A2, and A3 can either be three real branches of bank A, such as Beijing branch, Shanghai branch, and Zhejiang branch, or three virtual branches set up according to certain rules, such as A1: 1-1000, A2: 1001-10000 and A3: 10000+
according to the number of loans. In this way, any operation on A is converted into three separate operations on A1, A2, and A3.
NebulaGraph supports using snapshots to back up and restore data. When data loss or misoperation occurs, the data will be restored through the snapshot.
"},{"location":"backup-and-restore/3.manage-snapshot/#prerequisites","title":"Prerequisites","text":"NebulaGraph authentication is disabled by default. In this case, all users can use the snapshot feature.
If authentication is enabled, only the GOD role user can use the snapshot feature. For more information about roles, see Roles and privileges.
"},{"location":"backup-and-restore/3.manage-snapshot/#precautions","title":"Precautions","text":"ADD HOST
, DROP HOST
, CREATE SPACE
, DROP SPACE
, and BALANCE
are performed.DROP SNAPSHOT
.Run CREATE SNAPSHOT
to create a snapshot for all the graph spaces based on the current time for NebulaGraph. Creating a snapshot for a specific graph space is not supported yet.
Note
If the creation fails, refer to the later section to delete the corrupted snapshot and then recreate the snapshot.
nebula> CREATE SNAPSHOT;\n
"},{"location":"backup-and-restore/3.manage-snapshot/#view_snapshots","title":"View snapshots","text":"To view all existing snapshots, run SHOW SNAPSHOTS
.
nebula> SHOW SNAPSHOTS;\n+--------------------------------+---------+------------------+\n| Name | Status | Hosts |\n+--------------------------------+---------+------------------+\n| \"SNAPSHOT_2021_03_09_08_43_12\" | \"VALID\" | \"127.0.0.1:9779\" |\n| \"SNAPSHOT_2021_03_09_09_10_52\" | \"VALID\" | \"127.0.0.1:9779\" |\n+--------------------------------+---------+------------------+\n
The parameters in the return information are described as follows.
Parameter DescriptionName
The name of the snapshot directory. The prefix SNAPSHOT
indicates that the file is a snapshot file, and the suffix indicates the time the snapshot was created (UTC). Status
The status of the snapshot. VALID
indicates that the creation succeeded, while INVALID
indicates that it failed. Hosts
The IPs (or hostnames) and ports of all Storage servers at the time the snapshot was created."},{"location":"backup-and-restore/3.manage-snapshot/#snapshot_path","title":"Snapshot path","text":"Snapshots are stored in the path specified by the data_path
parameter in the Meta and Storage configuration files. When a snapshot is created, the checkpoints
directory is checked in the datastore path of the leader Meta service and all Storage services for the existence, and if it is not there, it is automatically created. The newly created snapshot is stored as a subdirectory within the checkpoints
directory. For example, SNAPSHOT_2021_03_09_08_43_12
. The suffix 2021_03_09_08_43_12
is generated automatically based on the creation time (UTC).
To fast locate the path where the snapshots are stored, you can use the Linux command find
in the datastore path. For example:
$ cd /usr/local/nebula-graph-ent-master/data\n$ find |grep 'SNAPSHOT_2021_03_09_08_43_12'\n./data/meta2/nebula/0/checkpoints/SNAPSHOT_2021_03_09_08_43_12\n./data/meta2/nebula/0/checkpoints/SNAPSHOT_2021_03_09_08_43_12/data\n./data/meta2/nebula/0/checkpoints/SNAPSHOT_2021_03_09_08_43_12/data/000081.sst\n...\n
"},{"location":"backup-and-restore/3.manage-snapshot/#delete_snapshots","title":"Delete snapshots","text":"To delete a snapshot with the given name, run DROP SNAPSHOT
.
DROP SNAPSHOT <snapshot_name>;\n
Example:
nebula> DROP SNAPSHOT SNAPSHOT_2021_03_09_08_43_12;\nnebula> SHOW SNAPSHOTS;\n+--------------------------------+---------+------------------+\n| Name | Status | Hosts |\n+--------------------------------+---------+------------------+\n| \"SNAPSHOT_2021_03_09_09_10_52\" | \"VALID\" | \"127.0.0.1:9779\" |\n+--------------------------------+---------+------------------+\n
Note
Deleting the only snapshot within the checkpoints
directory also deletes the checkpoints
directory.
Warning
When you restore data with snapshots, make sure that the graph spaces backed up in the snapshot have not been dropped. Otherwise, the data of the graph spaces cannot be restored.
Currently, there is no command to restore data with snapshots. You need to manually copy the snapshot file to the corresponding folder, or you can make it by using a shell script. The logic implements as follows:
After the snapshot is created, the checkpoints
directory is generated in the installation directory of the leader Meta service and all Storage services, and saves the created snapshot. Taking this topic as an example, when there are two graph spaces, the snapshots created are saved in /usr/local/nebula/data/meta/nebula/0/checkpoints
, /usr/local/nebula/data/storage/ nebula/3/checkpoints
and /usr/local/nebula/data/storage/nebula/4/checkpoints
.
$ ls /usr/local/nebula/data/meta/nebula/0/checkpoints/\nSNAPSHOT_2021_03_09_09_10_52\n$ ls /usr/local/nebula/data/storage/nebula/3/checkpoints/\nSNAPSHOT_2021_03_09_09_10_52\n$ ls /usr/local/nebula/data/storage/nebula/4/checkpoints/\nSNAPSHOT_2021_03_09_09_10_52\n
To restore the lost data through snapshots, you can take a snapshot at an appropriate time, copy the folders data
and wal
in the corresponding snapshot directory to its parent directory (at the same level with checkpoints
) to overwrite the previous data
and wal
, and then restart the cluster.
Warning
The data and wal directories of all Meta services should be overwritten at the same time. Otherwise, the new leader Meta service will use the latest Meta data after a cluster is restarted.
Backup & Restore (BR for short) is a Command-Line Interface (CLI) tool to back up data of graph spaces of NebulaGraph and to restore data from the backup files.
"},{"location":"backup-and-restore/nebula-br/1.what-is-br/#features","title":"Features","text":"The BR has the following features. It supports:
To use the BR, follow these steps:
This topic introduces the installation of BR in bare-metal deployment scenarios.
"},{"location":"backup-and-restore/nebula-br/2.compile-br/#notes","title":"Notes","text":"To use the BR (Community Edition) tool, you need to install the NebulaGraph Agent service, which is taken as a daemon for each machine in the cluster that starts and stops the NebulaGraph service, and uploads and downloads backup files. The BR (Community Edition) tool and the Agent plug-in are installed as described below.
"},{"location":"backup-and-restore/nebula-br/2.compile-br/#version_compatibility","title":"Version compatibility","text":"NebulaGraph BR Agent 3.5.x ~ 3.6.0 3.6.0 3.6.x ~ 3.7.0 3.3.0 ~ 3.4.x 3.3.0 0.2.0 ~ 3.4.0 3.0.x ~ 3.2.x 0.6.1 0.1.0 ~ 0.2.0"},{"location":"backup-and-restore/nebula-br/2.compile-br/#install_br_with_a_binary_file","title":"Install BR with a binary file","text":"Install BR.
wget https://github.com/vesoft-inc/nebula-br/releases/download/v3.6.0/br-3.6.0-linux-amd64\n
Change the binary file name to br
.
sudo mv br-3.6.0-linux-amd64 br\n
Grant execute permission to BR.
sudo chmod +x br\n
Run ./br version
to check BR version.
[nebula-br]$ ./br version\nNebula Backup And Restore Utility Tool,V-3.6.0\n
Before compiling the BR, check the following:
To compile the BR, follow these steps:
Clone the nebula-br
repository to your machine.
git clone https://github.com/vesoft-inc/nebula-br.git\n
Change to the br
directory.
cd nebula-br\n
Compile the BR.
make\n
Users can enter bin/br version
on the command line. If the following results are returned, the BR is compiled successfully.
[nebula-br]$ bin/br version\nNebulaGraph Backup And Restore Utility Tool,V-3.6.0\n
"},{"location":"backup-and-restore/nebula-br/2.compile-br/#install_agent","title":"Install Agent","text":"NebulaGraph Agent is installed as a binary file in each machine and serves the BR tool with the RPC protocol.
In each machine, follow these steps:
Install Agent.
wget https://github.com/vesoft-inc/nebula-agent/releases/download/v3.7.0/agent-3.7.0-linux-amd64\n
Rename the Agent file to agent
.
sudo mv agent-3.7.0-linux-amd64 agent\n
Add execute permission to Agent.
sudo chmod +x agent\n
Start Agent.
Note
Before starting Agent, make sure that the Meta service has been started and Agent has read and write access to the corresponding NebulaGraph cluster directory and backup directory.
sudo nohup ./agent --agent=\"<agent_node_ip>:8888\" --meta=\"<metad_node_ip>:9559\" --ratelimit=<file_size_bt> > nebula_agent.log 2>&1 &\n
--agent
: The IP address and port number of Agent.--meta
: The IP address and access port of any Meta service in the cluster.--ratelimit
: (Optional) Limits the speed of file uploads and downloads to prevent bandwidth from being filled up and making other services unavailable. Unit: Bytes. For example:
sudo nohup ./agent --agent=\"192.168.8.129:8888\" --meta=\"192.168.8.129:9559\" --ratelimit=1048576 > nebula_agent.log 2>&1 &\n
Caution
The IP address format for --agent
should be the same as that of Meta and Storage services set in the configuration files. That is, use the real IP addresses or use 127.0.0.1
. Otherwise, Agent cannot run.
Log into NebulaGraph and then run the following command to view the status of Agent.
nebula> SHOW HOSTS AGENT;\n+-----------------+------+----------+---------+--------------+---------+\n| Host | Port | Status | Role | Git Info Sha | Version |\n+-----------------+------+----------+---------+--------------+---------+\n| \"192.168.8.129\" | 8888 | \"ONLINE\" | \"AGENT\" | \"96646b8\" | |\n+-----------------+------+----------+---------+--------------+---------+ \n
If you encounter E_LIST_CLUSTER_NO_AGENT_FAILURE
error, the Agent service may not be started or may not be registered with the Meta service. First, execute SHOW HOSTS AGENT
to check the status of the Agent service on all nodes in the cluster. If the status shows OFFLINE
, the registration of Agent failed. In that case, check whether the value of the --meta
option in the command that starts the Agent service is correct.
After the BR is installed, you can back up data of the entire graph space. This topic introduces how to use the BR to back up data.
"},{"location":"backup-and-restore/nebula-br/3.br-backup-data/#prerequisites","title":"Prerequisites","text":"To back up data with the BR, do a check of these:
If you store the backup files locally, create a directory with the same absolute path on the meta servers, the storage servers, and the BR machine for the backup files and get the absolute path. Make sure the account has write privileges for this directory.
Warning
In the production environment, we recommend that you mount Network File System (NFS) storage to the meta servers, the storage servers, and the BR machine for local backup, or use Amazon S3 or Alibaba Cloud OSS for remote backup. When you restore the data from local files, you must manually move these backup files to a specified directory, which causes data redundancy and extra work. For more information, see Restore data from backup files.
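For the local-directory case, a minimal sketch, assuming /home/nebula/backup as the backup path; run it on the meta servers, the storage servers, and the BR machine:
```bash
# Create the same absolute backup path on every machine involved.
sudo mkdir -p /home/nebula/backup
# Make sure the account that runs BR and Agent can write to it.
sudo chown "$(whoami)" /home/nebula/backup
```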
In the BR installation directory (the default path of the compiled BR is ./bin/br
), run the following command to perform a full backup for the entire cluster.
Note
Make sure that the local path where the backup file is stored exists.
$ ./br backup full --meta <ip_address> --storage <storage_path>\n
For example:
Run the following command to perform a full backup for the entire cluster whose meta service address is 192.168.8.129:9559
, and save the backup file to /home/nebula/backup/
.
Caution
If there are multiple metad addresses, you can use any one of them.
Caution
If you back up data to a local disk, only the data of the leader metad is backed up by default. So if there are multiple metad processes, you need to manually copy the directory of the leader metad (path <storage_path>/meta
) and overwrite the corresponding directory of other follower metad processes (see the sketch below).
$ ./br backup full --meta \"192.168.8.129:9559\" --storage \"local:///home/nebula/backup/\"\n
Run the following command to perform a full backup for the entire cluster whose meta service address is 192.168.8.129:9559
, and save the backup file to backup
in the br-test
bucket of the object storage service compatible with S3 protocol.
$ ./br backup full --meta \"192.168.8.129:9559\" --s3.endpoint \"http://192.168.8.129:9000\" --storage=\"s3://br-test/backup/\" --s3.access_key=minioadmin --s3.secret_key=minioadmin --s3.region=default\n
The parameters are as follows.
Parameter Data type Required Default value Description -h,-help
- No None Views help information. --debug
- No None Checks for more log information. --log
string No \"br.log\"
Specifies detailed log path for restoration and backup. --meta
string Yes None The IP address and port of the meta service. --space
string No None (Experimental feature) Specifies the names of the spaces to be backed up. All spaces will be backed up if not specified. Multiple spaces can be specified, and the format is --spaces nba_01 --spaces nba_02
. --storage
string Yes None The target storage URL of BR backup data. The format is: \\<Schema>://\\<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
. PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID."},{"location":"backup-and-restore/nebula-br/3.br-backup-data/#next_to_do","title":"Next to do","text":"After the backup files are generated, you can use the BR to restore them for NebulaGraph. For more information, see Use BR to restore data.
"},{"location":"backup-and-restore/nebula-br/4.br-restore-data/","title":"Use BR to restore data","text":"If you use the BR to back up data, you can use it to restore the data to NebulaGraph. This topic introduces how to use the BR to restore data from backup files.
Caution
During the restoration process, the data on the target NebulaGraph cluster is removed and then is replaced with the data from the backup files. If necessary, back up the data on the target cluster.
Caution
The restoration process is performed OFFLINE.
"},{"location":"backup-and-restore/nebula-br/4.br-restore-data/#prerequisites","title":"Prerequisites","text":"In the BR installation directory (the default path of the compiled BR is ./br
), perform the following operations to restore data.
Users can use the following command to list the existing backup information:
$ ./br show --storage <storage_path>\n
For example, run the following command to list the backup information in the local /home/nebula/backup
path. $ ./br show --storage \"local:///home/nebula/backup\"\n+----------------------------+---------------------+------------------------+-------------+------------+\n| NAME | CREATE TIME | SPACES | FULL BACKUP | ALL SPACES |\n+----------------------------+---------------------+------------------------+-------------+------------+\n| BACKUP_2022_02_10_07_40_41 | 2022-02-10 07:40:41 | basketballplayer | true | true |\n| BACKUP_2022_02_11_08_26_43 | 2022-02-11 08:26:47 | basketballplayer,foesa | true | true |\n+----------------------------+---------------------+------------------------+-------------+------------+\n
Or, you can run the following command to list the backup information stored in S3 URL s3://192.168.8.129:9000/br-test/backup
.
$ ./br show --s3.endpoint \"http://192.168.8.129:9000\" --storage=\"s3://br-test/backup/\" --s3.access_key=minioadmin --s3.secret_key=minioadmin --s3.region=default\n
Parameter Data type Required Default value Description -h,-help
- No None Checks help for restoration. -debug
- No None Checks for more log information. -log
string No \"br.log\"
Specifies detailed log path for restoration and backup. --storage
string Yes None The target storage URL of BR backup data. The format is: <Schema>://<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
. PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID. Run the following command to restore data.
$ ./br restore full --meta <ip_address> --storage <storage_path> --name <backup_name>\n
For example, run the following command to upload the backup files from the local /home/nebula/backup/
to the cluster where the meta service's address is 192.168.8.129:9559
.
$ ./br restore full --meta \"192.168.8.129:9559\" --storage \"local:///home/nebula/backup/\" --name BACKUP_2021_12_08_18_38_08\n
Or, you can run the following command to upload the backup files from the S3 URL s3://192.168.8.129:9000/br-test/backup
.
$ ./br restore full --meta \"192.168.8.129:9559\" --s3.endpoint \"http://192.168.8.129:9000\" --storage=\"s3://br-test/backup/\" --s3.access_key=minioadmin --s3.secret_key=minioadmin --s3.region=\"default\" --name BACKUP_2021_12_08_18_38_08\n
If the following information is returned, the data is restored successfully.
Restore succeed.\n
Caution
If the host IPs of the new cluster are not all the same as those of the backup cluster, after restoration, you should run add hosts
to add the Storage host IPs in the new cluster one by one.
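For example, assuming 192.168.8.130:9779 is a Storage host in the new cluster:
```
nebula> ADD HOSTS 192.168.8.130:9779;
```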
The parameters are as follows.
Parameter Data type Required Default value Description -h,-help
- No None Checks help for restoration. -debug
- No None Checks for more log information. -log
string No \"br.log\"
Specifies detailed log path for restoration and backup. -meta
string Yes None The IP address and port of the meta service. -name
string Yes None The name of backup. --storage
string Yes None The target storage URL of BR backup data. The format is: \\<Schema>://\\<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
. PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID. Run the following command to clean up temporary files if any error occurred during backup. It cleans up the files both in the cluster and in the external storage. You can also use it to clean up old backup files in the external storage.
$ ./br cleanup --meta <ip_address> --storage <storage_path> --name <backup_name>\n
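For example, the following command cleans up the backup named BACKUP_2021_12_08_18_38_08 from the local backup path used in the examples above:
```bash
$ ./br cleanup --meta "192.168.8.129:9559" --storage "local:///home/nebula/backup/" --name BACKUP_2021_12_08_18_38_08
```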
The parameters are as follows.
Parameter Data type Required Default value Description -h,-help
- No None Views help information. -debug
- No None Checks for more log information. -log
string No \"br.log\"
Specifies detailed log path for restoration and backup. -meta
string Yes None The IP address and port of the meta service. -name
string Yes None The name of backup. --storage
string Yes None The target storage URL of BR backup data. The format is: \\<Schema>://\\<PATH>. Schema: Optional values are local
and s3
. When selecting s3, you need to fill in s3.access_key
, s3.endpoint
, s3.region
, and s3.secret_key
. PATH: The path of the storage location. --s3.access_key
string No None Sets AccessKey ID. --s3.endpoint
string No None Sets the S3 endpoint URL, please specify the HTTP or HTTPS scheme explicitly. --s3.region
string No None Sets the region or location to upload or download the backup. --s3.secret_key
string No None Sets SecretKey for AccessKey ID. NebulaGraph Flink Connector is a connector that helps Flink users quickly access NebulaGraph. NebulaGraph Flink Connector supports reading data from the NebulaGraph database or writing other external data to the NebulaGraph database.
For more information, see NebulaGraph Flink Connector.
"},{"location":"connector/nebula-flink-connector/#use_cases","title":"Use cases","text":"NebulaGraph Flink Connector applies to the following scenarios:
Release
"},{"location":"connector/nebula-flink-connector/#version_compatibility","title":"Version compatibility","text":"The correspondence between the NebulaGraph Flink Connector version and the NebulaGraph core version is as follows.
Flink Connector version NebulaGraph version 3.0-SNAPSHOT nightly 3.5.0 3.x.x 3.3.0 3.x.x 3.0.0 3.x.x 2.6.1 2.6.0, 2.6.1 2.6.0 2.6.0, 2.6.1 2.5.0 2.5.0, 2.5.1 2.0.0 2.0.0, 2.0.1"},{"location":"connector/nebula-flink-connector/#prerequisites","title":"Prerequisites","text":"Add the following dependency to the Maven configuration file pom.xml
to automatically obtain the Flink Connector.
<dependency>\n <groupId>com.vesoft</groupId>\n <artifactId>nebula-flink-connector</artifactId>\n <version>3.5.0</version>\n</dependency>\n
"},{"location":"connector/nebula-flink-connector/#compile_and_package","title":"Compile and package","text":"Follow the steps below to compile and package the Flink Connector.
Clone repository nebula-flink-connector
.
$ git clone -b release-3.5 https://github.com/vesoft-inc/nebula-flink-connector.git\n
Enter the nebula-flink-connector
directory.
Compile and package.
$ mvn clean package -Dmaven.test.skip=true\n
After compilation, a file similar to nebula-flink-connector-3.5.0.jar
is generated in the directory connector/target
of the project folder.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();\nNebulaClientOptions nebulaClientOptions = new NebulaClientOptions.NebulaClientOptionsBuilder()\n .setGraphAddress(\"127.0.0.1:9669\")\n .setMetaAddress(\"127.0.0.1:9559\")\n .build();\nNebulaGraphConnectionProvider graphConnectionProvider = new NebulaGraphConnectionProvider(nebulaClientOptions);\nNebulaMetaConnectionProvider metaConnectionProvider = new NebulaMetaConnectionProvider(nebulaClientOptions);\n\nVertexExecutionOptions executionOptions = new VertexExecutionOptions.ExecutionOptionBuilder()\n .setGraphSpace(\"flinkSink\")\n .setTag(\"player\")\n .setIdIndex(0)\n .setFields(Arrays.asList(\"name\", \"age\"))\n .setPositions(Arrays.asList(1, 2))\n .setBatchSize(2)\n .build();\n\nNebulaVertexBatchOutputFormat outputFormat = new NebulaVertexBatchOutputFormat(\n graphConnectionProvider, metaConnectionProvider, executionOptions);\nNebulaSinkFunction<Row> nebulaSinkFunction = new NebulaSinkFunction<>(outputFormat);\nDataStream<Row> dataStream = playerSource.map(row -> {\n Row record = new org.apache.flink.types.Row(row.size());\n for (int i = 0; i < row.size(); i++) {\n record.setField(i, row.get(i));\n }\n return record;\n });\ndataStream.addSink(nebulaSinkFunction);\nenv.execute(\"write nebula\");\n
"},{"location":"connector/nebula-flink-connector/#read_data_from_nebulagraph","title":"Read data from NebulaGraph","text":"NebulaClientOptions nebulaClientOptions = new NebulaClientOptions.NebulaClientOptionsBuilder()\n .setMetaAddress(\"127.0.0.1:9559\")\n .build();\nstorageConnectionProvider = new NebulaStorageConnectionProvider(nebulaClientOptions);\nStreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();\nenv.setParallelism(1);\n\nVertexExecutionOptions vertexExecutionOptions = new VertexExecutionOptions.ExecutionOptionBuilder()\n .setGraphSpace(\"flinkSource\")\n .setTag(\"person\")\n .setNoColumn(false)\n .setFields(Arrays.asList())\n .setLimit(100)\n .build();\n\nNebulaSourceFunction sourceFunction = new NebulaSourceFunction(storageConnectionProvider)\n .setExecutionOptions(vertexExecutionOptions);\nDataStreamSource<BaseTableRow> dataStreamSource = env.addSource(sourceFunction);\ndataStreamSource.map(row -> {\n List<ValueWrapper> values = row.getValues();\n Row record = new Row(15);\n record.setField(0, values.get(0).asLong());\n record.setField(1, values.get(1).asString());\n record.setField(2, values.get(2).asString());\n record.setField(3, values.get(3).asLong());\n record.setField(4, values.get(4).asLong());\n record.setField(5, values.get(5).asLong());\n record.setField(6, values.get(6).asLong());\n record.setField(7, values.get(7).asDate());\n record.setField(8, values.get(8).asDateTime().getUTCDateTimeStr());\n record.setField(9, values.get(9).asLong());\n record.setField(10, values.get(10).asBoolean());\n record.setField(11, values.get(11).asDouble());\n record.setField(12, values.get(12).asDouble());\n record.setField(13, values.get(13).asTime().getUTCTimeStr());\n record.setField(14, values.get(14).asGeography());\n return record;\n}).print();\nenv.execute(\"NebulaStreamSource\");\n
"},{"location":"connector/nebula-flink-connector/#parameter_descriptions","title":"Parameter descriptions","text":"NebulaClientOptions
is the configuration for connecting to NebulaGraph, as described below.
setGraphAddress
String Yes The Graph service address of NebulaGraph. setMetaAddress
String Yes The Meta service address of NebulaGraph. VertexExecutionOptions
is the configuration for reading vertices from and writing vertices to NebulaGraph, as described below.
setGraphSpace
String Yes The graph space name. setTag
String Yes The tag name. setIdIndex
Int Yes The subscript of the stream data field that is used as the VID when writing data to NebulaGraph. setFields
List Yes A collection of the property names of a tag. It is used to write data to or read data from NebulaGraph. Make sure the setNoColumn
is false
when reading data; otherwise, the configuration is invalid. If this parameter is empty, all properties are read when reading data from NebulaGraph. setPositions
List Yes A collection of the subscripts of the stream data fields. It indicates that the corresponding field values are written to NebulaGraph as property values. This parameter needs to correspond to setFields
. setBatchSize
String No The maximum number of data records to write to NebulaGraph at a time. The default value is 2000
. setNoColumn
String No The properties are not to be read if set to true
when reading data. The default value is false
. setLimit
String No The maximum number of data records to pull at a time when reading data. The default value is 2000
. EdgeExecutionOptions
is the configuration for reading edges from and writing edges to NebulaGraph, as described below.
setGraphSpace
String Yes The graph space name. setEdge
String Yes The edge type name. setSrcIndex
Int Yes The subscript of the stream data field that is used as the VID of the source vertex when writing data to NebulaGraph. setDstIndex
Int Yes The subscript of the stream data field that is used as the VID of the destination vertex when writing data to NebulaGraph. setRankIndex
Int Yes The subscript of the stream data field that is used as the rank of the edge when writing data to NebulaGraph. setFields
List Yes A collection of the property names of an edge type. It is used to write data to or read data from NebulaGraph. Make sure the setNoColumn
is false
when reading data; otherwise, the configuration is invalid. If this parameter is empty, all properties are read when reading data from NebulaGraph. setPositions
List Yes A collection of the subscripts of the stream data fields. It indicates that the corresponding field values are written to NebulaGraph as property values. This parameter needs to correspond to setFields
. setBatchSize
String No The maximum number of data records to write to NebulaGraph at a time. The default value is 2000
. setNoColumn
String No The properties are not to be read if set to true
when reading data. The default value is false
. setLimit
String No The maximum number of data records to pull at a time when reading data. The default value is 2000
. Create a graph space.
NebulaCatalog nebulaCatalog = NebulaCatalogUtils.createNebulaCatalog(\n \"NebulaCatalog\",\n \"default\",\n \"root\",\n \"nebula\",\n \"127.0.0.1:9559\",\n \"127.0.0.1:9669\");\n\nEnvironmentSettings settings = EnvironmentSettings.newInstance()\n .inStreamingMode()\n .build();\nTableEnvironment tableEnv = TableEnvironment.create(settings);\n\ntableEnv.registerCatalog(CATALOG_NAME, nebulaCatalog);\ntableEnv.useCatalog(CATALOG_NAME);\n\nString createDataBase = \"CREATE DATABASE IF NOT EXISTS `db1`\"\n + \" COMMENT 'space 1'\"\n + \" WITH (\"\n + \" 'partition_num' = '100',\"\n + \" 'replica_factor' = '3',\"\n + \" 'vid_type' = 'FIXED_STRING(10)'\"\n + \")\";\ntableEnv.executeSql(createDataBase);\n
Create a tag.
tableEnvironment.executeSql(\"CREATE TABLE `person` (\"\n + \" vid BIGINT,\"\n + \" col1 STRING,\"\n + \" col2 STRING,\"\n + \" col3 BIGINT,\"\n + \" col4 BIGINT,\"\n + \" col5 BIGINT,\"\n + \" col6 BIGINT,\"\n + \" col7 DATE,\"\n + \" col8 TIMESTAMP,\"\n + \" col9 BIGINT,\"\n + \" col10 BOOLEAN,\"\n + \" col11 DOUBLE,\"\n + \" col12 DOUBLE,\"\n + \" col13 TIME,\"\n + \" col14 STRING\"\n + \") WITH (\"\n + \" 'connector' = 'nebula',\"\n + \" 'meta-address' = '127.0.0.1:9559',\"\n + \" 'graph-address' = '127.0.0.1:9669',\"\n + \" 'username' = 'root',\"\n + \" 'password' = 'nebula',\"\n + \" 'data-type' = 'vertex',\"\n + \" 'graph-space' = 'flink_test',\"\n + \" 'label-name' = 'person'\"\n + \")\"\n);\n
Create an edge type.
tableEnvironment.executeSql(\"CREATE TABLE `friend` (\"\n + \" sid BIGINT,\"\n + \" did BIGINT,\"\n + \" rid BIGINT,\"\n + \" col1 STRING,\"\n + \" col2 STRING,\"\n + \" col3 BIGINT,\"\n + \" col4 BIGINT,\"\n + \" col5 BIGINT,\"\n + \" col6 BIGINT,\"\n + \" col7 DATE,\"\n + \" col8 TIMESTAMP,\"\n + \" col9 BIGINT,\"\n + \" col10 BOOLEAN,\"\n + \" col11 DOUBLE,\"\n + \" col12 DOUBLE,\"\n + \" col13 TIME,\"\n + \" col14 STRING\"\n + \") WITH (\"\n + \" 'connector' = 'nebula',\"\n + \" 'meta-address' = '127.0.0.1:9559',\"\n + \" 'graph-address' = '127.0.0.1:9669',\"\n + \" 'username' = 'root',\"\n + \" 'password' = 'nebula',\"\n + \" 'graph-space' = 'flink_test',\"\n + \" 'label-name' = 'friend',\"\n + \" 'data-type'='edge',\"\n + \" 'src-id-index'='0',\"\n + \" 'dst-id-index'='1',\"\n + \" 'rank-id-index'='2'\"\n + \")\"\n);\n
Queries the data of an edge type and inserts it into another edge type.
Table table = tableEnvironment.sqlQuery(\"SELECT * FROM `friend`\");\ntable.executeInsert(\"`friend_sink`\").await();\n
NebulaGraph Spark Connector is a Spark connector application for reading and writing NebulaGraph data in Spark standard format. NebulaGraph Spark Connector consists of two parts: Reader and Writer.
Reader
Provides a Spark SQL interface. This interface can be used to read NebulaGraph data. It reads one vertex or edge type data at a time and assemble the result into a Spark DataFrame.
Writer
Provides a Spark SQL interface. This interface can be used to write DataFrames into NebulaGraph in a row-by-row or batch-import way.
For more information, see NebulaGraph Spark Connector.
"},{"location":"connector/nebula-spark-connector/#version_compatibility","title":"Version compatibility","text":"The correspondence between the NebulaGraph Spark Connector version, the NebulaGraph core version and the Spark version is as follows.
Spark Connector version NebulaGraph version Spark version nebula-spark-connector_3.0-3.0-SNAPSHOT.jar nightly 3.x nebula-spark-connector_2.2-3.0-SNAPSHOT.jar nightly 2.2.x nebula-spark-connector-3.0-SNAPSHOT.jar nightly 2.4.x nebula-spark-connector_3.0-3.6.0.jar 3.x 3.x nebula-spark-connector_2.2-3.6.0.jar 3.x 2.2.x nebula-spark-connector-3.6.0.jar 3.x 2.4.x nebula-spark-connector_2.2-3.4.0.jar 3.x 2.2.x nebula-spark-connector-3.4.0.jar 3.x 2.4.x nebula-spark-connector_2.2-3.3.0.jar 3.x 2.2.x nebula-spark-connector-3.3.0.jar 3.x 2.4.x nebula-spark-connector-3.0.0.jar 3.x 2.4.x nebula-spark-connector-2.6.1.jar 2.6.0, 2.6.1 2.4.x nebula-spark-connector-2.6.0.jar 2.6.0, 2.6.1 2.4.x nebula-spark-connector-2.5.1.jar 2.5.0, 2.5.1 2.4.x nebula-spark-connector-2.5.0.jar 2.5.0, 2.5.1 2.4.x nebula-spark-connector-2.1.0.jar 2.0.0, 2.0.1 2.4.x nebula-spark-connector-2.0.1.jar 2.0.0, 2.0.1 2.4.x nebula-spark-connector-2.0.0.jar 2.0.0, 2.0.1 2.4.x"},{"location":"connector/nebula-spark-connector/#use_cases","title":"Use cases","text":"NebulaGraph Spark Connector applies to the following scenarios:
The features of NebulaGraph Spark Connector 3.6.0 are as follows:
insert
, update
and delete
, are supported. insert
mode will insert (overwrite) data, update
mode will only update existing data, and delete
mode will only delete data.Release
"},{"location":"connector/nebula-spark-connector/#get_nebulagraph_spark_connector","title":"Get NebulaGraph Spark Connector","text":""},{"location":"connector/nebula-spark-connector/#compile_and_package","title":"Compile and package","text":"Clone repository nebula-spark-connector
.
$ git clone -b release-3.6 https://github.com/vesoft-inc/nebula-spark-connector.git\n
Enter the nebula-spark-connector
directory.
Compile and package. The procedure varies with Spark versions.
Note
Spark of the corresponding version has been installed.
- Spark 2.4
```bash\n$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true -pl nebula-spark-connector -am -Pscala-2.11 -Pspark-2.4\n```\n
- Spark 2.2
```bash\n$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true -pl nebula-spark-connector_2.2 -am -Pscala-2.11 -Pspark-2.2\n```\n
- Spark 3.x
```bash\n$ mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true -pl nebula-spark-connector_3.0 -am -Pscala-2.12 -Pspark-3.0\n```\n
After compilation, a file similar to nebula-spark-connector-3.6.0-SHANPSHOT.jar
is generated in the directory target
of the folder.
Download
"},{"location":"connector/nebula-spark-connector/#how_to_use","title":"How to use","text":"When using NebulaGraph Spark Connector to reading and writing NebulaGraph data, You can refer to the following code.
# Read vertex and edge data from NebulaGraph.\nspark.read.nebula().loadVerticesToDF()\nspark.read.nebula().loadEdgesToDF()\n\n# Write dataframe data into NebulaGraph as vertex and edges.\ndataframe.write.nebula().writeVertices()\ndataframe.write.nebula().writeEdges()\n
nebula()
receives two configuration parameters, including connection configuration and read-write configuration.
Note
If the value of the properties contains Chinese characters, the encoding error may appear. Please add the following options when submitting the Spark task:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
"},{"location":"connector/nebula-spark-connector/#reading_data_from_nebulagraph","title":"Reading data from NebulaGraph","text":"val config = NebulaConnectionConfig\n .builder()\n .withMetaAddress(\"127.0.0.1:9559\")\n .withConenctionRetry(2)\n .withExecuteRetry(2)\n .withTimeout(6000)\n .build()\n\nval nebulaReadVertexConfig: ReadNebulaConfig = ReadNebulaConfig\n .builder()\n .withSpace(\"test\")\n .withLabel(\"person\")\n .withNoColumn(false)\n .withReturnCols(List(\"birthday\"))\n .withLimit(10)\n .withPartitionNum(10)\n .build()\nval vertex = spark.read.nebula(config, nebulaReadVertexConfig).loadVerticesToDF()\n\nval nebulaReadEdgeConfig: ReadNebulaConfig = ReadNebulaConfig\n .builder()\n .withSpace(\"test\")\n .withLabel(\"knows\")\n .withNoColumn(false)\n .withReturnCols(List(\"degree\"))\n .withLimit(10)\n .withPartitionNum(10)\n .build()\nval edge = spark.read.nebula(config, nebulaReadEdgeConfig).loadEdgesToDF()\n
NebulaConnectionConfig
is the configuration for connecting to NebulaGraph, as described below.
withMetaAddress
Yes Specifies the IP addresses and ports of all Meta Services. Separate multiple addresses with commas. The format is ip1:port1,ip2:port2,...
. Read data is no need to configure withGraphAddress
. withConnectionRetry
No The number of retries that the NebulaGraph Java Client connected to NebulaGraph. The default value is 1
. withExecuteRetry
No The number of retries that the NebulaGraph Java Client executed query statements. The default value is 1
. withTimeout
No The timeout for the NebulaGraph Java Client request response. The default value is 6000
, Unit: ms. ReadNebulaConfig
is the configuration to read NebulaGraph data, as described below.
withSpace
Yes NebulaGraph space name. withLabel
Yes The Tag or Edge type name within the NebulaGraph space. withNoColumn
No Whether the property is not read. The default value is false
, which means properties are read. If the value is true
, properties are not read and the withReturnCols
configuration is invalid. withReturnCols
No Configures the set of properties for vertices or edges to read. The format is List(property1,property2,...)
. The default value is List()
, indicating that all properties are read. withLimit
No Configure the number of rows of data read from the server by the NebulaGraph Java Storage Client at a time. The default value is 1000
. withPartitionNum
No Configures the number of Spark partitions to read the NebulaGraph data. The default value is 100
. This value should not exceed the number of slices in the graph space (partition_num). Note
The values of columns in a dataframe are automatically written to NebulaGraph as property values.
val config = NebulaConnectionConfig\n .builder()\n .withMetaAddress(\"127.0.0.1:9559\")\n .withGraphAddress(\"127.0.0.1:9669\")\n .withConenctionRetry(2)\n .build()\n\nval nebulaWriteVertexConfig: WriteNebulaVertexConfig = WriteNebulaVertexConfig \n .builder()\n .withSpace(\"test\")\n .withTag(\"person\")\n .withVidField(\"id\")\n .withVidPolicy(\"hash\")\n .withVidAsProp(true)\n .withUser(\"root\")\n .withPasswd(\"nebula\")\n .withBatch(1000)\n .build() \ndf.write.nebula(config, nebulaWriteVertexConfig).writeVertices()\n\nval nebulaWriteEdgeConfig: WriteNebulaEdgeConfig = WriteNebulaEdgeConfig \n .builder()\n .withSpace(\"test\")\n .withEdge(\"friend\")\n .withSrcIdField(\"src\")\n .withSrcPolicy(null)\n .withDstIdField(\"dst\")\n .withDstPolicy(null)\n .withRankField(\"degree\")\n .withSrcAsProperty(true)\n .withDstAsProperty(true)\n .withRankAsProperty(true)\n .withUser(\"root\")\n .withPasswd(\"nebula\")\n .withBatch(1000)\n .build()\ndf.write.nebula(config, nebulaWriteEdgeConfig).writeEdges()\n
The default write mode is insert
, which can be changed to update
or delete
via withWriteMode
configuration:
val config = NebulaConnectionConfig\n .builder()\n .withMetaAddress(\"127.0.0.1:9559\")\n .withGraphAddress(\"127.0.0.1:9669\")\n .build()\nval nebulaWriteVertexConfig = WriteNebulaVertexConfig\n .builder()\n .withSpace(\"test\")\n .withTag(\"person\")\n .withVidField(\"id\")\n .withVidAsProp(true)\n .withBatch(1000)\n .withWriteMode(WriteMode.UPDATE)\n .build()\ndf.write.nebula(config, nebulaWriteVertexConfig).writeVertices()\n
NebulaConnectionConfig
is the configuration for connecting to the nebula graph, as described below.
withMetaAddress
Yes Specifies the IP addresses and ports of all Meta Services. Separate multiple addresses with commas. The format is ip1:port1,ip2:port2,...
. withGraphAddress
Yes Specifies the IP addresses and ports of Graph Services. Separate multiple addresses with commas. The format is ip1:port1,ip2:port2,...
. withConnectionRetry
No The number of retries for the NebulaGraph Java Client to connect to NebulaGraph. The default value is 1
. WriteNebulaVertexConfig
is the configuration of the write vertex, as described below.
withSpace
Yes NebulaGraph space name. withTag
Yes The Tag name that needs to be associated when a vertex is written. withVidField
Yes The column in the DataFrame as the vertex ID. withVidPolicy
No When writing the vertex ID, NebulaGraph uses a mapping function. Only HASH is supported. No mapping is performed by default. withVidAsProp
No Whether the column in the DataFrame that is the vertex ID is also written as a property. The default value is false
. If set to true
, make sure the Tag has the same property name as VidField
. withUser
No NebulaGraph username. If authentication is disabled, you do not need to configure the username and password. withPasswd
No The password for the NebulaGraph username. withBatch
Yes The number of rows of data written at a time. The default value is 1000
. withWriteMode
No Write mode. The optional values are insert
, update
and delete
. The default value is insert
. withDeleteEdge
No Whether to delete the related edges synchronously when deleting a vertex. The default value is false
. It takes effect when withWriteMode
is delete
. WriteNebulaEdgeConfig
is the configuration of the write edge, as described below.
withSpace
Yes NebulaGraph space name. withEdge
Yes The Edge type name that needs to be associated when an edge is written. withSrcIdField
Yes The column in the DataFrame that serves as the source vertex ID. withSrcPolicy
No When writing the starting vertex ID, NebulaGraph uses a mapping function. Only HASH is supported. No mapping is performed by default. withDstIdField
Yes The column in the DataFrame that serves as the destination vertex ID. withDstPolicy
No When writing the destination vertex ID, NebulaGraph uses a mapping function. Only HASH is supported. No mapping is performed by default. withRankField
No The column in the DataFrame as the rank. Rank is not written by default. withSrcAsProperty
No Whether the column in the DataFrame that is the starting vertex is also written as a property. The default value is false
. If set to true
, make sure Edge type has the same property name as SrcIdField
. withDstAsProperty
No Whether the column in the DataFrame that is the destination vertex is also written as a property. The default value is false
. If set to true
, make sure Edge type has the same property name as DstIdField
. withRankAsProperty
No Whether the column in the DataFrame that is the rank is also written as a property. The default value is false
. If set to true
, make sure Edge type has the same property name as RankField
. withUser
No NebulaGraph username. If authentication is disabled, you do not need to configure the username and password. withPasswd
No The password for the NebulaGraph username. withBatch
Yes The number of rows of data written at a time. The default value is 1000
. withWriteMode
No Write mode. The optional values are insert
, update
and delete
. The default value is insert
. NebulaGraph Algorithm (Algorithm) is a Spark application based on GraphX. It uses a complete algorithm tool to perform graph computing on the data in the NebulaGraph database by submitting a Spark task. You can also programmatically use the algorithm under the lib repository to perform graph computing on DataFrame.
"},{"location":"graph-computing/nebula-algorithm/#version_compatibility","title":"Version compatibility","text":"The correspondence between the NebulaGraph Algorithm release and the NebulaGraph core release is as follows.
NebulaGraph NebulaGraph Algorithm nightly 3.0-SNAPSHOT 3.0.0 ~ 3.4.x 3.x.0 2.6.x 2.6.x 2.5.0, 2.5.1 2.5.0 2.0.0, 2.0.1 2.1.0"},{"location":"graph-computing/nebula-algorithm/#prerequisites","title":"Prerequisites","text":"Before using the NebulaGraph Algorithm, users need to confirm the following information:
Graph computing outputs vertex datasets, and the algorithm results are stored in DataFrames as the properties of vertices. You can do further operations such as statistics and filtering according to your business requirements.
!!!
Before Algorithm v3.1.0, when submitting the algorithm package directly, the data of the vertex ID must be an integer. That is, the vertex ID can be INT or String, but the data itself is an integer.\n
"},{"location":"graph-computing/nebula-algorithm/#supported_algorithms","title":"Supported algorithms","text":"The graph computing algorithms supported by NebulaGraph Algorithm are as follows.
Algorithm Description Scenario Properties name Properties type PageRank The rank of pages Web page ranking, key node mining pagerank double/string Louvain Louvain Community mining, hierarchical clustering louvain int/string KCore K core Community discovery, financial risk control kcore int/string LabelPropagation Label propagation Information spreading, advertising, and community discovery lpa int/string Hanp Label propagation advanced Community discovery, recommendation system hanp int/string ConnectedComponent Weakly connected component Community discovery, island discovery cc int/string StronglyConnectedComponent Strongly connected component Community discovery scc int/string ShortestPath The shortest path Path planning, network planning shortestpath string TriangleCount Triangle counting Network structure analysis trianglecount int/string GraphTriangleCount Graph triangle counting Network structure and tightness analysis count int BetweennessCentrality Intermediate centrality Key node mining, node influence computing betweenness double/string ClosenessCentrality Closeness centrality Key node mining, node influence computing closeness double/string DegreeStatic Degree of statistical Graph structure analysis degree,inDegree,outDegree int/string ClusteringCoefficient Aggregation coefficient Recommendation system, telecom fraud analysis clustercoefficient double/string Jaccard Jaccard similarity Similarity computing, recommendation system jaccard string BFS Breadth-First Search Sequence traversal, shortest path planning bfs string DFS Depth-First Search Sequence traversal, shortest path planning dfs string Node2Vec - Graph classification node2vec stringNote
When writing the algorithm results into the NebulaGraph, make sure that the tag in the corresponding graph space has properties names and data types corresponding to the table above.
"},{"location":"graph-computing/nebula-algorithm/#implementation_methods","title":"Implementation methods","text":"NebulaGraph Algorithm implements the graph calculating as follows:
Read the graph data of DataFrame from the NebulaGraph database using the NebulaGraph Spark Connector.
Transform the graph data of DataFrame to the GraphX graph.
Use graph algorithms provided by GraphX (such as PageRank) or self-implemented algorithms (such as Louvain).
For detailed implementation methods, see Scala file.
"},{"location":"graph-computing/nebula-algorithm/#get_nebulagraph_algorithm","title":"Get NebulaGraph Algorithm","text":""},{"location":"graph-computing/nebula-algorithm/#compile_and_package","title":"Compile and package","text":"Clone the repository nebula-algorithm
.
$ git clone -b v3.0.0 https://github.com/vesoft-inc/nebula-algorithm.git\n
Enter the directory nebula-algorithm
.
$ cd nebula-algorithm\n
Compile and package.
$ mvn clean package -Dgpg.skip -Dmaven.javadoc.skip=true -Dmaven.test.skip=true\n
After the compilation, a similar file nebula-algorithm-3.x.x.jar
is generated in the directory nebula-algorithm/target
.
Download address
"},{"location":"graph-computing/nebula-algorithm/#how_to_use","title":"How to use","text":"Note
If the value of the properties contains Chinese characters, the encoding error may appear. Please add the following options when submitting the Spark task:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
"},{"location":"graph-computing/nebula-algorithm/#use_algorithm_interface_recommended","title":"Use algorithm interface (recommended)","text":"The lib
repository provides 10 common graph algorithms.
Add dependencies to the file pom.xml
.
<dependency>\n <groupId>com.vesoft</groupId>\n <artifactId>nebula-algorithm</artifactId>\n <version>3.0.0</version>\n</dependency>\n
Use the algorithm (take PageRank as an example) by filling in parameters. For more examples, see example.
Note
By default, the DataFrame that executes the algorithm sets the first column as the starting vertex, the second column as the destination vertex, and the third column as the edge weights (not the rank in the NebulaGraph).
val prConfig = new PRConfig(5, 1.0)\nval prResult = PageRankAlgo.apply(spark, data, prConfig, false)\n
If your vertex IDs are Strings, see Pagerank Example for how to encoding and decoding them.
Set the Configuration file.
{\n # Configurations related to Spark\n spark: {\n app: {\n name: LPA\n # The number of partitions of Spark\n partitionNum:100\n }\n master:local\n }\n\n data: {\n # Data source. Optional values are nebula, csv, and json.\n source: csv\n # Data sink. The algorithm result will be written into this sink. Optional values are nebula, csv, and text.\n sink: nebula\n # Whether the algorithm has a weight.\n hasWeight: false\n }\n\n # Configurations related to NebulaGraph\n nebula: {\n # Data source. When NebulaGraph is the data source of the graph computing, the configuration of `nebula.read` is valid.\n read: {\n # The IP addresses and ports of all Meta services. Multiple addresses are separated by commas (,). Example: \"ip1:port1,ip2:port2\".\n # To deploy NebulaGraph by using Docker Compose, fill in the port with which Docker Compose maps to the outside.\n # Check the status with `docker-compose ps`.\n metaAddress: \"192.168.*.10:9559\"\n # The name of the graph space in NebulaGraph.\n space: basketballplayer\n # Edge types in NebulaGraph. When there are multiple labels, the data of multiple edges will be merged.\n labels: [\"serve\"]\n # The property name of each edge type in NebulaGraph. This property will be used as the weight column of the algorithm. Make sure that it corresponds to the edge type.\n weightCols: [\"start_year\"]\n }\n\n # Data sink. When the graph computing result sinks into NebulaGraph, the configuration of `nebula.write` is valid.\n write:{\n # The IP addresses and ports of all Graph services. Multiple addresses are separated by commas (,). Example: \"ip1:port1,ip2:port2\".\n # To deploy by using Docker Compose, fill in the port with which Docker Compose maps to the outside.\n # Check the status with `docker-compose ps`.\n graphAddress: \"192.168.*.11:9669\"\n # The IP addresses and ports of all Meta services. Multiple addresses are separated by commas (,). Example: \"ip1:port1,ip2:port2\".\n # To deploy NebulaGraph by using Docker Compose, fill in the port with which Docker Compose maps to the outside.\n # Check the staus with `docker-compose ps`.\n metaAddress: \"192.168.*.12:9559\"\n user:root\n pswd:nebula\n # Before submitting the graph computing task, create the graph space and tag.\n # The name of the graph space in NebulaGraph.\n space:nb\n # The name of the tag in NebulaGraph. The graph computing result will be written into this tag. The property name of this tag is as follows.\n # PageRank: pagerank\n # Louvain: louvain\n # ConnectedComponent: cc\n # StronglyConnectedComponent: scc\n # LabelPropagation: lpa\n # ShortestPath: shortestpath\n # DegreeStatic: degree,inDegree,outDegree\n # KCore: kcore\n # TriangleCount: tranglecpunt\n # BetweennessCentrality: betweennedss\n tag:pagerank\n }\n } \n\n local: {\n # Data source. When the data source is csv or json, the configuration of `local.read` is valid.\n read:{\n filePath: \"hdfs://127.0.0.1:9000/edge/work_for.csv\"\n # If the CSV file has a header or it is a json file, use the header. If not, use [_c0, _c1, _c2, ..., _cn] instead.\n # The header of the source VID column.\n srcId:\"_c0\"\n # The header of the destination VID column.\n dstId:\"_c1\"\n # The header of the weight column.\n weight: \"_c2\"\n # Whether the csv file has a header.\n header: false\n # The delimiter in the csv file.\n delimiter:\",\"\n }\n\n # Data sink. 
When the graph computing result sinks to the csv or text file, the configuration of `local.write` is valid.\n write:{\n resultPath:/tmp/\n }\n }\n\n\n algorithm: {\n # The algorithm to execute. Optional values are as follow: \n # pagerank, louvain, connectedcomponent, labelpropagation, shortestpaths, \n # degreestatic, kcore, stronglyconnectedcomponent, trianglecount ,\n # betweenness, graphtriangleCount.\n executeAlgo: pagerank\n\n # PageRank\n pagerank: {\n maxIter: 10\n resetProb: 0.15 \n encodeId:false # Configure true if the VID is of string type.\n }\n\n # Louvain\n louvain: {\n maxIter: 20\n internalIter: 10\n tol: 0.5\n encodeId:false # Configure true if the VID is of string type.\n }\n\n # ...\n\n}\n}\n
Note
When sink: nebula
is configured, it means that the algorithm results will be written back to the NebulaGraph cluster. The property names of the tag have implicit conventions. For details, see Supported algorithms section of this topic.
Submit the graph computing task.
${SPARK_HOME}/bin/spark-submit --master <mode> --class com.vesoft.nebula.algorithm.Main <nebula-algorithm-3.0.0.jar_path> -p <application.conf_path>\n
Example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.algorithm.Main /root/nebula-algorithm/target/nebula-algorithm-3.0-SNAPSHOT.jar -p /root/nebula-algorithm/src/main/resources/application.conf\n
NebulaGraph Importer (Importer) is a standalone tool for importing data from CSV files into NebulaGraph. Importer can read and batch import CSV file data from multiple data sources, and also supports batch update and delete operations.
"},{"location":"import-export/use-importer/#features","title":"Features","text":"The version correspondence between NebulaGraph and NebulaGraph Importer is as follows.
NebulaGraph version NebulaGraph Importer version 3.x.x 3.x.x, 4.x.x 2.x.x 2.x.x, 3.x.xNote
Importer 4.0.0 has redone the Importer for improved performance, but the configuration file is not compatible with older versions. It is recommended to use the new version of Importer.
"},{"location":"import-export/use-importer/#release_note","title":"Release note","text":"Release
"},{"location":"import-export/use-importer/#prerequisites","title":"Prerequisites","text":"Before using NebulaGraph Importer, make sure:
NebulaGraph service has been deployed. The deployment method is as follows:
manager.hooks.before.statements
.Prepare the CSV file to be imported and configure the YAML file to use the tool to batch write data into NebulaGraph.
Note
For details about the YAML configuration file, see Configuration File Description at the end of topic.
"},{"location":"import-export/use-importer/#download_binary_package_and_run","title":"Download binary package and run","text":"Download the executable binary package.
Note
The file installation path based on the RPM/DEB package is /usr/bin/nebula-importer
.
Under the directory where the binary file is located, run the following command to start importing data.
./<binary_file_name> --config <yaml_config_file_path>\n
Compiling the source code requires deploying a Golang environment. For details, see Build Go environment.
Clone repository.
git clone -b release-4.1 https://github.com/vesoft-inc/nebula-importer.git\n
Note
Use the correct branch. Different branches have different RPC protocols.
Access the directory nebula-importer
.
cd nebula-importer\n
Compile the source code.
make build\n
Start the service.
./bin/nebula-importer --config <yaml_config_file_path>\n
Instead of installing the Go locale locally, you can use Docker to pull the image of the NebulaGraph Importer and mount the local configuration file and CSV data file into the container. The command is as follows:
docker pull vesoft/nebula-importer:<version>\ndocker run --rm -ti \\\n --network=host \\\n -v <config_file>:<config_file> \\\n -v <data_dir>:<data_dir> \\\n vesoft/nebula-importer:<version> \\\n --config <config_file>\n
<config_file>
: The absolute path to the YAML configuration file.<data_dir>
: The absolute path to the CSV data file. If the file is not local, ignore this parameter.<version>
: For NebulaGraph 3.x, fill in v3.
A relative path is recommended. If you use a local absolute path, check that the path maps to the path in the Docker.
Example:
docker pull vesoft/nebula-importer:v4\ndocker run --rm -ti \\\n --network=host \\\n -v /home/user/config.yaml:/home/user/config.yaml \\\n -v /home/user/data:/home/user/data \\\n vesoft/nebula-importer:v4 \\\n --config /home/user/config.yaml\n
"},{"location":"import-export/use-importer/#configuration_file_description","title":"Configuration File Description","text":"Various example configuration files are available within the Github of the NebulaGraph Importer. The configuration files are used to describe information about the files to be imported, NebulaGraph server information, etc. The following section describes the fields within the configuration file in categories.
Note
If users download a binary package, create the configuration file manually.
"},{"location":"import-export/use-importer/#client_configuration","title":"Client configuration","text":"Client configuration stores the configuration associated with the client's connection to the NebulaGraph.
The example configuration is as follows:
client:\n version: v3\n address: \"192.168.1.100:9669,192.168.1.101:9669\"\n user: root\n password: nebula\n ssl:\n enable: true\n certPath: \"/home/xxx/cert/importer.crt\"\n keyPath: \"/home/xxx/cert/importer.key\"\n caPath: \"/home/xxx/cert/root.crt\"\n insecureSkipVerify: false\n concurrencyPerAddress: 10\n reconnectInitialInterval: 1s\n retry: 3\n retryInitialInterval: 1s\n
Parameter Default value Required Description client.version
v3
Yes Specifies the major version of the NebulaGraph. Currently only v3
is supported. client.address
\"127.0.0.1:9669\"
Yes Specifies the address of the NebulaGraph. Multiple addresses are separated by commas. client.user
root
No NebulaGraph user name. client.password
nebula
No The password for the NebulaGraph user name. client.ssl.enable
false
No Specifies whether to enable SSL authentication. client.ssl.certPath
- No Specifies the storage path for the SSL public key certificate. This parameter is required when SSL authentication is enabled. client.ssl.keyPath
- No Specifies the storage path for the SSL key. This parameter is required when SSL authentication is enabled. client.ssl.caPath
- No Specifies the storage path for the CA root certificate. This parameter is required when SSL authentication is enabled. client.ssl.insecureSkipVerify
false
No Specifies whether the client skips verifying the server's certificate chain and hostname. If set to true
, any certificate chain and hostname provided by the server is accepted. client.concurrencyPerAddress
10
No The number of concurrent client connections for a single graph service. client.reconnectInitialInterval
1s
No Reconnect interval time. client.retry
3
No The number of retries for failed execution of the nGQL statement. client.retryInitialInterval
1s
No Retry interval time."},{"location":"import-export/use-importer/#manager_configuration","title":"Manager configuration","text":"Manager configuration is a human-controlled configuration after connecting to the database.
The example configuration is as follows:
manager:\n spaceName: basic_string_examples\n batch: 128\n readerConcurrency: 50\n importerConcurrency: 512\n statsInterval: 10s\n hooks:\n before:\n - statements:\n - UPDATE CONFIGS storage:wal_ttl=3600;\n - UPDATE CONFIGS storage:rocksdb_column_family_options = { disable_auto_compactions = true };\n - statements:\n - |\n DROP SPACE IF EXISTS basic_string_examples;\n CREATE SPACE IF NOT EXISTS basic_string_examples(partition_num=5, replica_factor=1, vid_type=int);\n USE basic_string_examples;\n wait: 10s\n after:\n - statements:\n - |\n UPDATE CONFIGS storage:wal_ttl=86400;\n UPDATE CONFIGS storage:rocksdb_column_family_options = { disable_auto_compactions = false };\n
Parameter Default value Required Description manager.spaceName
- Yes Specifies the NebulaGraph space to import the data into. Do not support importing multiple map spaces at the same time. manager.batch
128
No The batch size for executing statements (global configuration). Setting the batch size individually for a data source can using the parameter sources.batch
below. manager.readerConcurrency
50
No The number of concurrent reads of the data source by the reader. manager.importerConcurrency
512
No The number of concurrent nGQL statements generated to be executed, and then will call the client to execute these nGQL statements. manager.statsInterval
10s
No The time interval for printing statistical information manager.hooks.before.[].statements
- No The command to execute in the graph space before importing. manager.hooks.before.[].wait
- No The wait time after statements
are executed. manager.hooks.after.[].statements
- No The commands to execute in the graph space after importing. manager.hooks.after.[].wait
- No The wait time after statements
are executed."},{"location":"import-export/use-importer/#log_configuration","title":"Log configuration","text":"Log configuration is the logging-related configuration.
The example configuration is as follows:
log:\n level: INFO\n console: true\n files:\n - logs/nebula-importer.log\n
Parameter Default value Required Description log.level
INFO
No Specifies the log level. Optional values are DEBUG
, INFO
, WARN
, ERROR
, PANIC
, FATAL
. log.console
true
No Whether to print the logs to console synchronously when storing logs. log.files
- No The log file path. The log directory must exist."},{"location":"import-export/use-importer/#source_configuration","title":"Source configuration","text":"The Source configuration requires the configuration of data source information, data processing methods, and Schema mapping.
The example configuration is as follows:
sources:\n - path: ./person.csv # Required. Specifies the path where the data files are stored. If a relative path is used, the path and current configuration file directory are spliced. Wildcard filename is also supported, for example: ./follower-*.csv, please make sure that all matching files with the same schema.\n# - s3: # AWS S3\n# endpoint: endpoint # Optional. The endpoint of S3 service, can be omitted if using AWS S3.\n# region: us-east-1 # Required. The region of S3 service.\n# bucket: gdelt-open-data # Required. The bucket of file in S3 service.\n# key: events/20190918.export.csv # Required. The object key of file in S3 service.\n# accessKeyID: \"\" # Optional. The access key of S3 service. If it is public data, no need to configure.\n# accessKeySecret: \"\" # Optional. The secret key of S3 service. If it is public data, no need to configure.\n# - oss:\n# endpoint: https://oss-cn-hangzhou.aliyuncs.com # Required. The endpoint of OSS service.\n# bucket: bucketName # Required. The bucket of file in OSS service.\n# key: objectKey # Required. The object key of file in OSS service.\n# accessKeyID: accessKey # Required. The access key of OSS service.\n# accessKeySecret: secretKey # Required. The secret key of OSS service.\n# - ftp:\n# host: 192.168.0.10 # Required. The host of FTP service.\n# port: 21 # Required. The port of FTP service.\n# user: user # Required. The user of FTP service.\n# password: password # Required. The password of FTP service.\n# path: \"/events/20190918.export.csv\" # Required. The path of file in the FTP service.\n# - sftp:\n# host: 192.168.0.10 # Required. The host of SFTP service.\n# port: 22 # Required. The port of SFTP service.\n# user: user # Required. The user of SFTP service.\n# password: password # Optional. The password of SFTP service.\n# keyFile: keyFile # Optional. The ssh key file path of SFTP service.\n# keyData: keyData $ Optional. The ssh key file content of SFTP service.\n# passphrase: passphrase # Optional. The ssh key passphrase of SFTP service.\n# path: \"/events/20190918.export.csv\" # Required. The path of file in the SFTP service.\n# - hdfs:\n# address: \"127.0.0.1:8020\" # Required. The address of HDFS service.\n# user: \"hdfs\" # Optional. The user of HDFS service.\n# servicePrincipalName: <Kerberos Service Principal Name> # Optional. The name of the Kerberos service instance for the HDFS service when Kerberos authentication is enabled.\n# krb5ConfigFile: <Kerberos config file> # Optional. The path to the Kerberos configuration file for the HDFS service when Kerberos authentication is enabled. Defaults to `/etc/krb5.conf`.\n# ccacheFile: <Kerberos ccache file> # Optional. The path to the Kerberos ccache file for the HDFS service when Kerberos authentication is enabled.\n# keyTabFile: <Kerberos keytab file> # Optional. The path to the Kerberos keytab file for the HDFS service when Kerberos authentication is enabled.\n# password: <Kerberos password> # Optional. The Kerberos password for the HDFS service when Kerberos authentication is enabled.\n# dataTransferProtection: <Kerberos Data Transfer Protection> # Optional. The type of transport encryption when Kerberos authentication is enabled. Optional values are `authentication`, `integrity`, `privacy`.\n# disablePAFXFAST: false # Optional. Whether to disable the use of PA_FX_FAST for clients.\n# path: \"/events/20190918.export.csv\" # Required. The path to the file in the HDFS service. Wildcard filenames are also supported, e.g. 
`/events/*.export.csv`, make sure all matching files have the same schema.\n# - gcs: # Google Cloud Storage\n# bucket: chicago-crime-sample # Required. The name of the bucket in the GCS service.\n# key: stats/000000000000.csv # Required. The path to the file in the GCS service.\n# withoutAuthentication: false # Optional. Whether to anonymize access. Defaults to false, which means access with credentials.\n# # When using credentials access, one of the credentialsFile and credentialsJSON parameters is sufficient.\n# credentialsFile: \"/path/to/your/credentials/file\" # Optional. The path to the credentials file for the GCS service.\n# credentialsJSON: '{ # Optional. The JSON content of the credentials for the GCS service.\n# \"type\": \"service_account\",\n# \"project_id\": \"your-project-id\",\n# \"private_key_id\": \"key-id\",\n# \"private_key\": \"-----BEGIN PRIVATE KEY-----\\nxxxxx\\n-----END PRIVATE KEY-----\\n\",\n# \"client_email\": \"your-client@your-project-id.iam.gserviceaccount.com\",\n# \"client_id\": \"client-id\",\n# \"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\",\n# \"token_uri\": \"https://oauth2.googleapis.com/token\",\n# \"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\",\n# \"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/your-client%40your-project-id.iam.gserviceaccount.com\",\n# \"universe_domain\": \"googleapis.com\"\n# }'\n batch: 256\n csv:\n delimiter: \"|\"\n withHeader: false\n lazyQuotes: false\n tags:\n - name: Person\n# mode: INSERT\n# filter: \n# expr: Record[1] == \"XXX\" \n id:\n type: \"STRING\"\n function: \"hash\"\n# index: 0 \n concatItems:\n - person_\n - 0\n - _id\n props:\n - name: \"firstName\"\n type: \"STRING\"\n index: 1\n - name: \"lastName\"\n type: \"STRING\"\n index: 2\n - name: \"gender\"\n type: \"STRING\"\n index: 3\n nullable: true\n defaultValue: female\n - name: \"birthday\"\n type: \"DATE\"\n index: 4\n nullable: true\n nullValue: _NULL_\n - name: \"creationDate\"\n type: \"DATETIME\"\n index: 5\n - name: \"locationIP\"\n type: \"STRING\"\n index: 6\n - name: \"browserUsed\"\n type: \"STRING\"\n index: 7\n - path: ./knows.csv\n batch: 256\n edges:\n - name: KNOWS # person_knows_person\n# mode: INSERT\n# filter: \n# expr: Record[1] == \"XXX\"\n src:\n id:\n type: \"STRING\"\n concatItems:\n - person_\n - 0\n - _id\n dst:\n id:\n type: \"STRING\"\n concatItems:\n - person_\n - 1\n - _id\n props:\n - name: \"creationDate\"\n type: \"DATETIME\"\n index: 2\n nullable: true\n nullValue: _NULL_\n defaultValue: 0000-00-00T00:00:00\n
The configuration mainly includes the following parts:
sources.path
sources.s3
sources.oss
sources.ftp
sources.sftp
sources.hdfs
- No Specify data source information, such as a local file, HDFS, or S3. Only one data source can be configured per source
 entry. To configure multiple data sources, configure multiple source
 entries. See the comments in the example for the configuration items of different data sources. sources.batch
256
No The batch size for executing statements when importing this data source. The priority is higher than manager.batch
. sources.csv.delimiter
,
No Specifies the delimiter for the CSV file. Only 1-character string separators are supported. Special characters like tabs (\\t
) and hexadecimal values (e.g., 0x03
or Ctrl+C
) must be properly escaped and enclosed in double quotes, such as \"\\t\"
for tabs and \"\\x03\"
or \"\\u0003\"
for hexadecimal values, instead of using single quotes. For details on escaping special characters in yaml format, see Escaped Characters. sources.csv.withHeader
false
No Whether to ignore the first record in the CSV file. sources.csv.lazyQuotes
false
No Whether to allow lazy quotes. If lazyQuotes
is true, a quote may appear in an unquoted field and a non-doubled quote may appear in a quoted field. sources.tags.name
- Yes The tag name. sources.tags.mode
INSERT
No Batch operation types, including insert, update and delete. Optional values are INSERT
, UPDATE
and DELETE
. sources.tags.filter.expr
- No Filter the data and only import if the filter conditions are met. Supported comparison operators are ==
, !=
, <
, >
, <=
and >=
. Supported logical operators are not
 (!), and
 (&&), and or
(||). For example (Record[0] == \"Mahinda\" or Record[0] == \"Michael\") and Record[3] == \"male\"
. sources.tags.id.type
STRING
No The type of the VID. sources.tags.id.function
- No The function used to generate the VID. Currently, only the function hash
 is supported. sources.tags.id.index
- No The column number corresponding to the VID in the data file. If sources.tags.id.concatItems
is not configured, this parameter must be configured. sources.tags.id.concatItems
- No Used to concatenate two or more items into the VID. The concatenated items can be string
, int
or mixed. string
stands for a constant, int
for an index column. If this parameter is set, the sources.tags.id.index
parameter will not take effect. sources.tags.ignoreExistedIndex
true
No Whether to enable IGNORE_EXISTED_INDEX
, that is, do not update the index after inserting a vertex. sources.tags.props.name
- Yes The tag property name, which must match the Tag property in the database. sources.tags.props.type
STRING
No Property data type, supporting BOOL
, INT
, FLOAT
, DOUBLE
, STRING
, TIME
, TIMESTAMP
, DATE
, DATETIME
, GEOGRAPHY
, GEOGRAPHY(POINT)
, GEOGRAPHY(LINESTRING)
and GEOGRAPHY(POLYGON)
. sources.tags.props.index
- Yes The property corresponds to the column number in the data file. sources.tags.props.nullable
false
No Whether the property can be NULL
, optional values are true
or false
. sources.tags.props.nullValue
- No Ignored when nullable
is false
. The value used to determine whether it is a NULL
. The property is set to NULL
when the value is equal to nullValue
. sources.tags.props.alternativeIndices
- No Ignored when nullable
is false
. The property is fetched from records according to the indices in order until not equal to nullValue
. sources.tags.props.defaultValue
- No Ignored when nullable
is false
. The property default value, when all the values obtained by index
and alternativeIndices
are nullValue
. sources.edges.name
- Yes The edge type name. sources.edges.mode
INSERT
No Batch operation types, including insert, update and delete. Optional values are INSERT
, UPDATE
and DELETE
. sources.edges.filter.expr
- No Filter the data and only import if the filter conditions are met. Supported comparison operators are ==
, !=
, <
, >
, <=
and >=
. Supported logical operators are not
 (!), and
 (&&), and or
(||). For example (Record[0] == \"Mahinda\" or Record[0] == \"Michael\") and Record[3] == \"male\"
. sources.edges.src.id.type
STRING
No The data type of the VID at the starting vertex on the edge. sources.edges.src.id.index
- Yes The column number in the data file corresponding to the VID at the starting vertex on the edge. sources.edges.dst.id.type
STRING
No The data type of the VID at the destination vertex on the edge. sources.edges.dst.id.index
- Yes The column number in the data file corresponding to the VID at the destination vertex on the edge. sources.edges.rank.index
- No The column number in the data file corresponding to the rank on the edge. sources.edges.ignoreExistedIndex
true
No Whether to enable IGNORE_EXISTED_INDEX
, that is, do not update the index after inserting an edge. sources.edges.props.name
- No The edge type property name, which must match the edge type property in the database. sources.edges.props.type
STRING
No Property data type, supporting BOOL
, INT
, FLOAT
, DOUBLE
, STRING
, TIME
, TIMESTAMP
, DATE
, DATETIME
, GEOGRAPHY
, GEOGRAPHY(POINT)
, GEOGRAPHY(LINESTRING)
and GEOGRAPHY(POLYGON)
. sources.edges.props.index
- No The property corresponds to the column number in the data file. sources.edges.props.nullable
- No Whether the property can be NULL
, optional values are true
or false
. sources.edges.props.nullValue
- No Ignored when nullable
is false
. The value used to determine whether it is a NULL
. The property is set to NULL
when the value is equal to nullValue
. sources.edges.props.defaultValue
- No Ignored when nullable
is false
. The property default value, when all the values obtained by index
and alternativeIndices
are nullValue
. Note
The sequence numbers of the columns in the CSV file start from 0, that is, the sequence number of the first column is 0, and the sequence number of the second column is 1.
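As a minimal illustration of the filter syntax and the 0-based column numbering described above (the values are hypothetical, and the structure follows the sources example earlier in this topic):
tags:\n - name: Person\n filter:\n expr: (Record[0] == \"Mahinda\" or Record[0] == \"Michael\") and Record[3] == \"male\"\n id:\n type: \"STRING\"\n index: 0 # The first column holds the VID.\n props:\n - name: \"firstName\"\n type: \"STRING\"\n index: 1 # The second column holds the firstName property.\n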
"},{"location":"import-export/use-importer/#faq","title":"FAQ","text":""},{"location":"import-export/use-importer/#what_are_the_descriptions_of_the_fields_in_the_log_output","title":"What are the descriptions of the fields in the log output?","text":"The following is an example of a log entry:
\"msg\": \"44m20s 2h7m10s 25.85%(129 GiB/498 GiB) Records{Finished: 302016726, Failed: 0, Rate: 113538.13/s}, Requests{Finished: 181786, Failed: 0, Latency: 4.046496736s/4.06694393s, Rate: 68.34/s}, Processed{Finished: 908575178, Failed: 0, Rate: 341563.62/s}\"\n
The fields are described below:
44m20s 2h7m10s 25.85%(129 GiB/498 GiB)
corresponds to basic information about the importing process.Records
corresponds to the records of the CSV files.Finished
: The number of the completed records.Failed
: The number of the failed records.Rate
: The number of records imported per second.Requests
corresponds to the requests.Finished
: The number of the completed requests.Failed
: The number of the failed requests.Latency
: The time consumed by server-side requests / The time consumed by client-side requests.Rate
: The number of requests processed per second.Processed
corresponds to nodes and edges.Finished
: The number of the completed nodes and edges.Failed
: The number of the failed nodes and edges.Rate
: The number of nodes and edges processed per second.There are many ways to write data into NebulaGraph master:
The following figure shows the positions of these ways:
"},{"location":"import-export/write-tools/#export_tools","title":"Export tools","text":"Export the data in database to a CSV file or another graph space (different NebulaGraph database clusters are supported) using the export function of the Exchange.
Enterpriseonly
The export function is exclusively available in the Enterprise Edition. If you require access to this version, please contact us.
Could not resolve dependencies for project xxx
","text":"Please check the mirror
section of libexec/conf/settings.xml in the Maven installation directory
:
<mirror>\n <id>alimaven</id>\n <mirrorOf>central</mirrorOf>\n <name>aliyun maven</name>\n <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>\n</mirror>\n
Check whether the value of mirrorOf
is configured to *
. If it is, change it to central
or *,!SparkPackagesRepo,!bintray-streamnative-maven
.
Reason: There are two dependency packages in Exchange's pom.xml
that are not in Maven's central repository. pom.xml
configures the repository address for these two dependencies. If the mirrorOf
value for the mirror address configured in Maven is *
, all dependencies will be downloaded from the Central repository, causing the download to fail.
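As a sketch of the alternative fix mentioned above, the mirrorOf value can exclude the repositories that host these two dependencies while still mirroring everything else:
<mirrorOf>*,!SparkPackagesRepo,!bintray-streamnative-maven</mirrorOf>\n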
Problem description: The system reports Could not find artifact com.vesoft:client:jar:xxx-SNAPSHOT
when compiling.
Cause: There is no local Maven repository for storing or downloading SNAPSHOT packages. The default central repository in Maven only stores official releases, not development versions (SNAPSHOT).
Solution: Add the following configuration in the profiles
scope of Maven's setting.xml
file:
<profile>\n <activation>\n <activeByDefault>true</activeByDefault>\n </activation>\n <repositories>\n <repository>\n <id>snapshots</id>\n <url>https://oss.sonatype.org/content/repositories/snapshots/</url>\n <snapshots>\n <enabled>true</enabled>\n </snapshots>\n </repository>\n </repositories>\n </profile>\n
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#execution","title":"Execution","text":""},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_error_javalangclassnotfoundexception_comvesoftnebulaexchangeexchange","title":"Q: Error: java.lang.ClassNotFoundException: com.vesoft.nebula.exchange.Exchange
","text":"To submit a task in Yarn-Cluster mode, run the following command, especially the two '--conf' commands in the example.
$SPARK_HOME/bin/spark-submit --class com.vesoft.nebula.exchange.Exchange \\\n--master yarn-cluster \\\n--files application.conf \\\n--conf spark.driver.extraClassPath=./ \\\n--conf spark.executor.extraClassPath=./ \\\nnebula-exchange-3.0.0.jar \\\n-c application.conf\n
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_error_method_name_xxx_not_found","title":"Q: Error: method name xxx not found
","text":"Generally, the port configuration is incorrect. Check the port configuration of the Meta service, Graph service, and Storage service.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_error_nosuchmethod_methodnotfound_exception_in_thread_main_javalangnosuchmethoderror_etc","title":"Q: Error: NoSuchMethod, MethodNotFound (Exception in thread \"main\" java.lang.NoSuchMethodError
, etc)","text":"Most errors are caused by JAR package conflicts or version conflicts. Check whether the version of the error reporting service is the same as that used in Exchange, especially Spark, Scala, and Hive.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_when_exchange_imports_hive_data_error_exception_in_thread_main_orgapachesparksqlanalysisexception_table_or_view_not_found","title":"Q: When Exchange imports Hive data, error:Exception in thread \"main\" org.apache.spark.sql.AnalysisException: Table or view not found
","text":"Check whether the -h
parameter is omitted in the command for submitting the Exchange task and whether the table and database are correct, and run the user-configured exec statement in spark-SQL to verify the correctness of the exec statement.
com.facebook.thrift.protocol.TProtocolException: Expected protocol id xxx
","text":"Check that the NebulaGraph service port is configured correctly.
--port
in the configuration file for each service.Execute docker-compose ps
in the nebula-docker-compose
directory, for example:
$ docker-compose ps\n Name Command State Ports\n---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\nnebula-docker-compose_graphd_1 /usr/local/nebula/bin/nebu ... Up (healthy) 0.0.0.0:33205->19669/tcp, 0.0.0.0:33204->19670/tcp, 0.0.0.0:9669->9669/tcp\nnebula-docker-compose_metad0_1 ./bin/nebula-metad --flagf ... Up (healthy) 0.0.0.0:33165->19559/tcp, 0.0.0.0:33162->19560/tcp, 0.0.0.0:33167->9559/tcp, 9560/tcp\nnebula-docker-compose_metad1_1 ./bin/nebula-metad --flagf ... Up (healthy) 0.0.0.0:33166->19559/tcp, 0.0.0.0:33163->19560/tcp, 0.0.0.0:33168->9559/tcp, 9560/tcp\nnebula-docker-compose_metad2_1 ./bin/nebula-metad --flagf ... Up (healthy) 0.0.0.0:33161->19559/tcp, 0.0.0.0:33160->19560/tcp, 0.0.0.0:33164->9559/tcp, 9560/tcp\nnebula-docker-compose_storaged0_1 ./bin/nebula-storaged --fl ... Up (healthy) 0.0.0.0:33180->19779/tcp, 0.0.0.0:33178->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:33183->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged1_1 ./bin/nebula-storaged --fl ... Up (healthy) 0.0.0.0:33175->19779/tcp, 0.0.0.0:33172->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:33177->9779/tcp, 9780/tcp\nnebula-docker-compose_storaged2_1 ./bin/nebula-storaged --fl ... Up (healthy) 0.0.0.0:33184->19779/tcp, 0.0.0.0:33181->19780/tcp, 9777/tcp, 9778/tcp, 0.0.0.0:33185->9779/tcp, 9780/tcp\n
Check the Ports
column to find the docker mapped port number, for example:
- The port number available for Graph service is 9669.
- The port number for Meta service are 33167, 33168, 33164.
- The port number for Storage service are 33183, 33177, 33185.
Exception in thread \"main\" com.facebook.thrift.protocol.TProtocolException: The field 'code' has been assigned the invalid value -4
","text":"Check whether the version of Exchange is the same as that of NebulaGraph. For more information, see Limitations.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_how_to_correct_the_encoding_error_when_importing_data_in_a_spark_environment","title":"Q: How to correct the encoding error when importing data in a Spark environment?","text":"It may happen if the property value of the data contains Chinese characters. The solution is to add the following options before the JAR package path in the import command:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
Namely:
<spark_install_path>/bin/spark-submit --master \"local\" \\\n--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \\\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8 \\\n--class com.vesoft.nebula.exchange.Exchange \\\n<nebula-exchange-3.x.y.jar_path> -c <application.conf_path>\n
In YARN, use the following command:
<spark_install_path>/bin/spark-submit \\\n--class com.vesoft.nebula.exchange.Exchange \\\n--master yarn-cluster \\\n--files <application.conf_path> \\\n--conf spark.driver.extraClassPath=./ \\\n--conf spark.executor.extraClassPath=./ \\\n--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \\\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8 \\\n<nebula-exchange-3.x.y.jar_path> \\\n-c application.conf\n
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_orgrocksdbrocksdbexception_while_open_a_file_for_appending_pathsst1-xxxsst_no_such_file_or_directory","title":"Q: org.rocksdb.RocksDBException: While open a file for appending: /path/sst/1-xxx.sst: No such file or directory","text":"Solution:
/path
exists. If not, or if the path is set incorrectly, create or correct it./path
. If not, grant the permission.- limit: Represents the size of the token bucket.
- timeout: Represents the timeout period for obtaining the token.
The values of these four parameters can be adjusted appropriately according to the machine performance. If the leader of the Storage service changes during the import process, you can adjust the values of these four parameters to reduce the import speed.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#others","title":"Others","text":""},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_which_versions_of_nebulagraph_are_supported_by_exchange","title":"Q: Which versions of NebulaGraph are supported by Exchange?","text":"See Limitations.
"},{"location":"import-export/nebula-exchange/ex-ug-FAQ/#q_what_is_the_relationship_between_exchange_and_spark_writer","title":"Q: What is the relationship between Exchange and Spark Writer?","text":"Exchange is the Spark application developed based on Spark Writer. Both are suitable for bulk migration of cluster data to NebulaGraph in a distributed environment, but later maintenance work will be focused on Exchange. Compared with Spark Writer, Exchange has the following improvements:
This topic introduces how to get the JAR file of NebulaGraph Exchange.
"},{"location":"import-export/nebula-exchange/ex-ug-compile/#download_the_jar_file_directly","title":"Download the JAR file directly","text":"The JAR file of Exchange Community Edition can be downloaded directly.
To download Exchange Enterprise Edition, contact us.
"},{"location":"import-export/nebula-exchange/ex-ug-compile/#get_the_jar_file_by_compiling_the_source_code","title":"Get the JAR file by compiling the source code","text":"You can get the JAR file of Exchange Community Edition by compiling the source code. The following introduces how to compile the source code of Exchange.
Enterpriseonly
You can get Exchange Enterprise Edition in NebulaGraph Enterprise Edition Package only.
"},{"location":"import-export/nebula-exchange/ex-ug-compile/#prerequisites","title":"Prerequisites","text":"Clone the repository nebula-exchange
in the /
directory.
git clone -b release-3.7 https://github.com/vesoft-inc/nebula-exchange.git\n
Switch to the directory nebula-exchange
.
cd nebula-exchange\n
Package NebulaGraph Exchange. Run the following command based on the Spark version:
For Spark 2.2\uff1a
mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true \\\n-pl nebula-exchange_spark_2.2 -am -Pscala-2.11 -Pspark-2.2\n
For Spark 2.4\uff1a
mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true \\\n-pl nebula-exchange_spark_2.4 -am -Pscala-2.11 -Pspark-2.4\n
For Spark 3.0\uff1a
mvn clean package -Dmaven.test.skip=true -Dgpg.skip -Dmaven.javadoc.skip=true \\\n-pl nebula-exchange_spark_3.0 -am -Pscala-2.12 -Pspark-3.0\n
After the compilation is successful, you can find the nebula-exchange_spark_x.x-release-3.7.jar
file in the nebula-exchange_spark_x.x/target/
directory. x.x
indicates the Spark version, for example, 2.4
.
Note
The JAR file version changes with the release of the NebulaGraph Java Client. Users can view the latest version on the Releases page.
When migrating data, you can refer to configuration file target/classes/application.conf
.
If downloading dependencies fails when compiling:
Modify the mirror
part of Maven installation directory libexec/conf/settings.xml
:
<mirror>\n <id>alimaven</id>\n <mirrorOf>central</mirrorOf>\n <name>aliyun maven</name>\n <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>\n</mirror>\n
This topic describes some of the limitations of using Exchange 3.x.
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-limitations/#environment","title":"Environment","text":"Exchange 3.x supports the following operating systems:
To ensure the healthy operation of Exchange, ensure that the following software has been installed on the machine:
Apache Spark. The requirements for Spark versions when using Exchange to export data from data sources are as follows. In the following table, Y means that the corresponding Spark version is supported, and N means not supported.
Note
Use the correct Exchange JAR file based on the Spark version. For example, for Spark version 2.4, use nebula-exchange_spark_2.4-3.7.0.jar.
Data source Spark 2.2 Spark 2.4 Spark 3 CSV file Y N Y JSON file Y Y Y ORC file Y Y Y Parquet file Y Y Y HBase Y Y Y MySQL Y Y Y PostgreSQL Y Y Y Oracle Y Y Y ClickHouse Y Y Y Neo4j N Y N Hive Y Y Y MaxCompute N Y N Pulsar N Y Untested Kafka N Y Untested NebulaGraph N Y N
Hadoop Distributed File System (HDFS) needs to be deployed in the following scenarios:
NebulaGraph Exchange (Exchange) is an Apache Spark\u2122 application for bulk migration of cluster data to NebulaGraph in a distributed environment, supporting batch and streaming data migration in a variety of formats.
Exchange consists of Reader, Processor, and Writer. After Reader reads data from different sources and returns a DataFrame, the Processor iterates through each row of the DataFrame and obtains the corresponding value based on the mapping between fields
in the configuration file. After iterating through the number of rows in the specified batch, Writer writes the captured data to the NebulaGraph at once. The following figure illustrates the process by which Exchange completes the data conversion and migration.
Exchange has two editions, the Community Edition and the Enterprise Edition. The Community Edition is open source developed on GitHub. The Enterprise Edition supports not only the functions of the Community Edition but also adds additional features. For details, see Comparisons.
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-what-is-exchange/#scenarios","title":"Scenarios","text":"Exchange applies to the following scenarios:
The data saved in NebulaGraph needs to be exported.
Enterpriseonly
Exporting the data saved in NebulaGraph is supported by Exchange Enterprise Edition only.
Exchange has the following advantages:
Resumable data import: It supports resumable data import to save time and improve data import efficiency.
Note
Resumable data import is currently supported when migrating Neo4j data only.
Exchange supports Spark versions 2.2.x, 2.4.x, and 3.x.x, which are named nebula-exchange_spark_2.2
, nebula-exchange_spark_2.4
, and nebula-exchange_spark_3.0
for different Spark versions.
The correspondence between the NebulaGraph Exchange version (the JAR version), the NebulaGraph core version and the Spark version is as follows.
Exchange version NebulaGraph version Spark version nebula-exchange_spark_3.0-3.0-SNAPSHOT.jar nightly 3.3.x\u30013.2.x\u30013.1.x\u30013.0.x nebula-exchange_spark_2.4-3.0-SNAPSHOT.jar nightly 2.4.x nebula-exchange_spark_2.2-3.0-SNAPSHOT.jar nightly 2.2.x nebula-exchange_spark_3.0-3.4.0.jar 3.x.x 3.3.x\u30013.2.x\u30013.1.x\u30013.0.x nebula-exchange_spark_2.4-3.4.0.jar 3.x.x 2.4.x nebula-exchange_spark_2.2-3.4.0.jar 3.x.x 2.2.x nebula-exchange_spark_3.0-3.3.0.jar 3.x.x 3.3.x\u30013.2.x\u30013.1.x\u30013.0.x nebula-exchange_spark_2.4-3.3.0.jar 3.x.x 2.4.x nebula-exchange_spark_2.2-3.3.0.jar 3.x.x 2.2.x nebula-exchange_spark_3.0-3.0.0.jar 3.x.x 3.3.x\u30013.2.x\u30013.1.x\u30013.0.x nebula-exchange_spark_2.4-3.0.0.jar 3.x.x 2.4.x nebula-exchange_spark_2.2-3.0.0.jar 3.x.x 2.2.x nebula-exchange-2.6.3.jar 2.6.1\u30012.6.0 2.4.x nebula-exchange-2.6.2.jar 2.6.1\u30012.6.0 2.4.x nebula-exchange-2.6.1.jar 2.6.1\u30012.6.0 2.4.x nebula-exchange-2.6.0.jar 2.6.1\u30012.6.0 2.4.x nebula-exchange-2.5.2.jar 2.5.1\u30012.5.0 2.4.x nebula-exchange-2.5.1.jar 2.5.1\u30012.5.0 2.4.x nebula-exchange-2.5.0.jar 2.5.1\u30012.5.0 2.4.x nebula-exchange-2.1.0.jar 2.0.1\u30012.0.0 2.4.x nebula-exchange-2.0.1.jar 2.0.1\u30012.0.0 2.4.x nebula-exchange-2.0.0.jar 2.0.1\u30012.0.0 2.4.x"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-what-is-exchange/#data_source","title":"Data source","text":"Exchange 3.7.0 supports converting data from the following formats or sources into vertexes and edges that NebulaGraph can recognize, and then importing them into NebulaGraph in the form of nGQL statements:
Data repository:
In addition to importing data as nGQL statements, Exchange supports generating SST files for data sources and then importing SST files via Console.
"},{"location":"import-export/nebula-exchange/about-exchange/ex-ug-what-is-exchange/#release_note","title":"Release note","text":"Release
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command/","title":"Options for import","text":"After editing the configuration file, run the following commands to import specified source data into the NebulaGraph database.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command/#import_data","title":"Import data","text":"<spark_install_path>/bin/spark-submit --master \"spark://HOST:PORT\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> \n
Note
If the value of the properties contains Chinese characters, the encoding error may appear. Please add the following options when submitting the Spark task:
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8\n--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8\n
The following table lists command parameters.
Parameter Required Default value Description--class
Yes - Specify the main class of the driver. --master
Yes - Specify the URL of the master process in a Spark cluster. For more information, see master-urls. Optional values are:local
: Local Mode. Run Spark applications on a single thread. Suitable for importing small data sets in a test environment.yarn
: Run Spark applications on a YARN cluster. Suitable for importing large data sets in a production environment.spark://HOST:PORT
: Connect to the specified Spark standalone cluster.mesos://HOST:PORT
: Connect to the specified Mesos cluster.k8s://HOST:PORT
: Connect to the specified Kubernetes cluster. -c
/--config
Yes - Specify the path of the configuration file. -h
/--hive
No false
Specify whether importing Hive data is supported. -D
/--dry
No false
Specify whether to check the format of the configuration file. This parameter is used to check the format of the configuration file only, it does not check the validity of tags
and edges
configurations and does not import data. Don't add this parameter if you need to import data. -r
/--reload
No - Specify the path of the reload file that needs to be reloaded. For more Spark parameter configurations, see Spark Configuration.
Note
$SPARK_HOME/bin/spark-submit --master yarn \\\n--class com.vesoft.nebula.exchange.Exchange \\\n--files application.conf \\\n--conf spark.driver.extraClassPath=./ \\\n--conf spark.executor.extraClassPath=./ \\\nnebula-exchange-3.7.0.jar \\\n-c application.conf\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command/#import_the_reload_file","title":"Import the reload file","text":"If some data fails to be imported during the import, the failed data will be stored in the reload file. Use the parameter -r
to import the data in reload file.
<spark_install_path>/bin/spark-submit --master \"spark://HOST:PORT\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> -r \"<reload_file_path>\" \n
If the import still fails, go to Official Forum for consultation.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/","title":"Parameters in the configuration file","text":"This topic describes how to automatically generate a template configuration file when users use NebulaGraph Exchange, and introduces the configuration file application.conf
.
Specify the data source to be imported with the following command to get the template configuration file corresponding to the data source.
java -cp <exchange_jar_package> com.vesoft.exchange.common.GenerateConfigTemplate -s <source_type> -p <config_file_save_path>\n
Example:
java -cp nebula-exchange_spark_2.4-3.0-SNAPSHOT.jar com.vesoft.exchange.common.GenerateConfigTemplate -s csv -p /home/nebula/csv_application.conf\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#using_an_encrypted_password","title":"Using an encrypted password","text":"You can use either a plaintext password or an RSA encrypted password when setting the password for connecting to NebulaGraph in the configuration file.
To use an RSA-encrypted password, you need to configure the following settings in the configuration file:
nebula.pswd
is configured as the RSA encrypted password.nebula.privateKey
is configured as the key for RSA encryption.nebula.enableRSA
is configured as true
.Users can use their own tools for encryption, or they can use the encryption tool provided in Exchange's JAR package, for example:
spark-submit --master local --class com.vesoft.exchange.common.PasswordEncryption nebula-exchange_spark_2.4-3.0-SNAPSHOT.jar -p nebula\n
The results returned are as follows:
=================== public key begin ===================\nMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCLl7LaNSEXlZo2hYiJqzxgyFBQdkxbQXYU/xQthsBJwjOPhkiY37nokzKnjNlp6mv5ZUomqxLsoNQHEJ6BZD4VPiaiElFAkTD+gyul1v8f3A446Fr2rnVLogWHnz8ECPt7X8jwmpiKOXkOPIhqU5E0Cua+Kk0nnVosbos/VShfiQIDAQAB\n=================== public key end ===================\n\n\n=================== private key begin ===================\nMIICeAIBADANBgkqhkiG9w0BAQEFAASCAmIwggJeAgEAAoGBAIuXsto1IReVmjaFiImrPGDIUFB2TFtBdhT/FC2GwEnCM4+GSJjfueiTMqeM2Wnqa/llSiarEuyg1AcQnoFkPhU+JqISUUCRMP6DK6XW/x/cDjjoWvaudUuiBYefPwQI+3tfyPCamIo5eQ48iGpTkTQK5r4qTSedWixuiz9VKF+JAgMBAAECgYADWbfEPwQ1UbTq3Bej3kVLuWMcG0rH4fFYnaq5UQOqgYvFRR7W9H+80lOj6+CIB0ViLgkylmaU4WNVbBOx3VsUFFWSqIIIviKubg8m8ey7KAd9X2wMEcUHi4JyS2+/WSacaXYS5LOmMevvuaOwLEV0QmyM+nNGRIjUdzCLR1935QJBAM+IF8YD5GnoAPPjGIDS1Ljhu/u/Gj6/YBCQKSHQ5+HxHEKjQ/YxQZ/otchmMZanYelf1y+byuJX3NZ04/KSGT8CQQCsMaoFO2rF5M84HpAXPi6yH2chbtz0VTKZworwUnpmMVbNUojf4VwzAyOhT1U5o0PpFbpi+NqQhC63VUN5k003AkEArI8vnVGNMlZbvG7e5/bmM9hWs2viSbxdB0inOtv2g1M1OV+B2gp405ru0/PNVcRV0HQFfCuhVfTSxmspQoAihwJBAJW6EZa/FZbB4JVxreUoAr6Lo8dkeOhT9M3SZbGWZivaFxot/Cp/8QXCYwbuzrJxjqlsZUeOD6694Uk08JkURn0CQQC8V6aRa8ylMhLJFkGkMDHLqHcQCmY53Kd73mUu4+mjMJLZh14zQD9ydFtc0lbLXTeBAMWV3uEdeLhRvdAo3OwV\n=================== private key end ===================\n\n\n=================== encrypted password begin ===================\nIo+3y3mLOMnZJJNUPHZ8pKb4VfTvg6wUh6jSu5xdmLAoX/59tK1HTwoN40aOOWJwa1a5io7S4JqcX/jEcAorw7pelITr+F4oB0AMCt71d+gJuu3/lw9bjUEl9tF4Raj82y2Dg39wYbagN84fZMgCD63TPiDIevSr6+MFKASpGrY=\n=================== encrypted password end ===================\ncheck: the real password decrypted by private key and encrypted password is: nebula\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#configuration_instructions","title":"Configuration instructions","text":"Before configuring the application.conf
file, it is recommended to copy the file name application.conf
and then edit the file name according to the file type of a data source. For example, change the file name to csv_application.conf
if the file type of the data source is CSV.
The application.conf
file contains the following content types:
This topic lists only some Spark parameters. For more information, see Spark Configuration.
Parameter Type Default value Required Descriptionspark.app.name
string - No The drive name in Spark. spark.driver.cores
int 1
No The number of CPU cores used by a driver, only applicable to a cluster mode. spark.driver.maxResultSize
string 1G
No The total size limit (in bytes) of the serialized results of all partitions in a single Spark operation (such as collect). The minimum value is 1M, and 0 means unlimited. spark.executor.memory
string 1G
No The amount of memory used by a Spark driver which can be specified in units, such as 512M or 1G. spark.cores.max
int 16
No The maximum number of CPU cores of applications requested across clusters (rather than from each node) when a driver runs in a coarse-grained sharing mode on a standalone cluster or a Mesos cluster. The default value is spark.deploy.defaultCores
on a Spark standalone cluster manager or the value of the infinite
parameter (all available cores) on Mesos."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#hive_configurations_optional","title":"Hive configurations (optional)","text":"Users only need to configure parameters for connecting to Hive if Spark and Hive are deployed in different clusters. Otherwise, please ignore the following configurations.
Parameter Type Default value Required Descriptionhive.warehouse
string - Yes The warehouse path in HDFS. Enclose the path in double quotes and start with hdfs://
. hive.connectionURL
string - Yes The URL of a JDBC connection. For example, \"jdbc:mysql://127.0.0.1:3306/hive_spark?characterEncoding=UTF-8\"
. hive.connectionDriverName
string \"com.mysql.jdbc.Driver\"
Yes The driver name. hive.connectionUserName
list[string] - Yes The username for connections. hive.connectionPassword
list[string] - Yes The account password."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#nebulagraph_configurations","title":"NebulaGraph configurations","text":"Parameter Type Default value Required Description nebula.address.graph
list[string] [\"127.0.0.1:9669\"]
Yes The addresses of all Graph services, including IPs and ports, separated by commas (,). Example: [\"ip1:port1\",\"ip2:port2\",\"ip3:port3\"]
. nebula.address.meta
list[string] [\"127.0.0.1:9559\"]
Yes The addresses of all Meta services, including IPs and ports, separated by commas (,). Example: [\"ip1:port1\",\"ip2:port2\",\"ip3:port3\"]
. nebula.user
string - Yes The username with write permissions for NebulaGraph. nebula.pswd
string - Yes The account password. The password can be plaintext or RSA encrypted. To use an RSA encrypted password, you need to set enableRSA
and privateKey
. For how to encrypt a password, see Using an encrypted password above. nebula.enableRSA
bool false
No Whether to use an RSA encrypted password. nebula.privateKey
string - No The key used to encrypt the password using RSA. nebula.space
string - Yes The name of the graph space where data needs to be imported. nebula.ssl.enable.graph
bool false
Yes Enables the SSL encryption between Exchange and Graph services. If the value is true
, the SSL encryption is enabled and the following SSL parameters take effect. If Exchange is run on a multi-machine cluster, you need to store the corresponding files in the same path on each machine when setting the following SSL-related paths. nebula.ssl.sign
string ca
Yes Specifies the SSL sign. Optional values are ca
and self
. nebula.ssl.ca.param.caCrtFilePath
string Specifies the storage path of the CA certificate. It takes effect when the value of nebula.ssl.sign
is ca
. nebula.ssl.ca.param.crtFilePath
string \"/path/crtFilePath\"
Yes Specifies the storage path of the CRT certificate. It takes effect when the value of nebula.ssl.sign
is ca
. nebula.ssl.ca.param.keyFilePath
string \"/path/keyFilePath\"
Yes Specifies the storage path of the key file. It takes effect when the value of nebula.ssl.sign
is ca
. nebula.ssl.self.param.crtFilePath
string \"/path/crtFilePath\"
Yes Specifies the storage path of the CRT certificate. It takes effect when the value of nebula.ssl.sign
is self
. nebula.ssl.self.param.keyFilePath
string \"/path/keyFilePath\"
Yes Specifies the storage path of the key file. It takes effect when the value of nebula.ssl.sign
is self
. nebula.ssl.self.param.password
string \"nebula\"
Yes Specifies the storage path of the password. It takes effect when the value of nebula.ssl.sign
is self
. nebula.path.local
string \"/tmp\"
No The local SST file path which needs to be set when users import SST files. nebula.path.remote
string \"/sst\"
No The remote SST file path which needs to be set when users import SST files. nebula.path.hdfs.namenode
string \"hdfs://name_node:9000\"
No The NameNode path which needs to be set when users import SST files. nebula.connection.timeout
int 3000
No The timeout set for Thrift connections. Unit: ms. nebula.connection.retry
int 3
No Retries set for Thrift connections. nebula.execution.retry
int 3
No Retries set for executing nGQL statements. nebula.error.max
int 32
No The maximum number of failures during the import process. When the number of failures reaches the maximum, the submitted Spark job will stop automatically. nebula.error.output
string /tmp/errors
No The path to output error logs. Failed nGQL statement executions are saved in the error log. nebula.rate.limit
int 1024
No The limit on the number of tokens in the token bucket when importing data. nebula.rate.timeout
int 1000
No The timeout period for getting tokens from a token bucket. Unit: milliseconds. Note
NebulaGraph doesn't support vertices without tags by default. To import vertices without tags, enable vertices without tags in the NebulaGraph cluster and then add parameter nebula.enableTagless
to the Exchange configuration with the value true
. For example:
nebula: {\n address:{\n graph:[\"127.0.0.1:9669\"]\n meta:[\"127.0.0.1:9559\"]\n }\n user: root\n pswd: nebula\n space: test\n enableTagless: true\n ......\n\n }\n
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#vertex_configurations","title":"Vertex configurations","text":"For different data sources, the vertex configurations are different. There are many general parameters and some specific parameters. General parameters and specific parameters of different data sources need to be configured when users configure vertices.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#general_parameters","title":"General parameters","text":"Parameter Type Default value Required Descriptiontags.name
string - Yes The tag name defined in NebulaGraph. tags.type.source
string - Yes Specify a data source. For example, csv
. tags.type.sink
string client
Yes Specify an import method. Optional values are client
and SST
. tags.writeMode
string INSERT
No Types of batch operations on data, including batch inserts, updates, and deletes. Optional values are INSERT
, UPDATE
, DELETE
. tags.deleteEdge
string false
No Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when tags.writeMode
is DELETE
. tags.fields
list[string] - Yes The header or column name of the column corresponding to properties. If there is a header or a column name, please use that name directly. If a CSV file does not have a header, use the form of [_c0, _c1, _c2]
to represent the first column, the second column, the third column, and so on. tags.nebula.fields
list[string] - Yes Property names defined in NebulaGraph, the order of which must correspond to tags.fields
. For example, [_c1, _c2]
corresponds to [name, age]
, which means that values in the second column are the values of the property name
, and values in the third column are the values of the property age
. tags.vertex.field
string - Yes The column of vertex IDs. For example, when a CSV file has no header, users can use _c0
to indicate values in the first column are vertex IDs. tags.vertex.udf.separator
string - No Support merging multiple columns by custom rules. This parameter specifies the join character. tags.vertex.udf.oldColNames
list - No Support merging multiple columns by custom rules. This parameter specifies the names of the columns to be merged. Multiple columns are separated by commas. tags.vertex.udf.newColName
string - No Support merging multiple columns by custom rules. This parameter specifies the new column name. tags.vertex.prefix
string - No Add the specified prefix to the VID. For example, if the VID is 12345
, adding the prefix tag1
will result in tag1_12345
. The underscore cannot be modified. tags.vertex.policy
string - No Supports only the value hash
. Performs hashing operations on VIDs of type string. tags.batch
int 256
Yes The maximum number of vertices written into NebulaGraph in a single batch. tags.partition
int 32
Yes The number of partitions to be created when the data is written to NebulaGraph. If tags.partition \u2264 1
, the number of partitions to be created in NebulaGraph is the same as that in the data source. tags.filter
string - No The filtering rule. The data that matches the filter rule is imported into NebulaGraph. For information about filtering formats, see Dataset."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_parquetjsonorc_data_sources","title":"Specific parameters of Parquet/JSON/ORC data sources","text":"Parameter Type Default value Required Description tags.path
string - Yes The path of vertex data files in HDFS. Enclose the path in double quotes and start with hdfs://
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_csv_data_sources","title":"Specific parameters of CSV data sources","text":"Parameter Type Default value Required Description tags.path
string - Yes The path of vertex data files in HDFS. Enclose the path in double quotes and start with hdfs://
. tags.separator
string ,
Yes The separator. The default value is a comma (,). For special characters, such as the control character ^A
, you can use ASCII octal \\001
or UNICODE encoded hexadecimal \\u0001
, for the control character ^B
, use ASCII octal \\002
or UNICODE encoded hexadecimal \\u0002
, for the control character ^C
, use ASCII octal \\003
or UNICODE encoded hexadecimal \\u0003
. tags.header
bool true
Yes Whether the file has a header."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_hive_data_sources","title":"Specific parameters of Hive data sources","text":"Parameter Type Default value Required Description tags.exec
string - Yes The statement to query data sources. For example, select name,age from mooc.users
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_maxcompute_data_sources","title":"Specific parameters of MaxCompute data sources","text":"Parameter Type Default value Required Description tags.table
string - Yes The table name of the MaxCompute. tags.project
string - Yes The project name of the MaxCompute. tags.odpsUrl
string - Yes The odpsUrl of the MaxCompute service. For more information about odpsUrl, see Endpoints. tags.tunnelUrl
string - Yes The tunnelUrl of the MaxCompute service. For more information about tunnelUrl, see Endpoints. tags.accessKeyId
string - Yes The accessKeyId of the MaxCompute service. tags.accessKeySecret
string - Yes The accessKeySecret of the MaxCompute service. tags.partitionSpec
string - No Partition descriptions of MaxCompute tables. tags.sentence
string - No Statements to query data sources. The table name in the SQL statement is the same as the value of the table above."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_neo4j_data_sources","title":"Specific parameters of Neo4j data sources","text":"Parameter Type Default value Required Description tags.exec
string - Yes Statements to query data sources. For example: match (n:label) return n.neo4j-field-0
. tags.server
string \"bolt://127.0.0.1:7687\"
Yes The server address of Neo4j. tags.user
string - Yes The Neo4j username with read permissions. tags.password
string - Yes The account password. tags.database
string - Yes The name of the database where source data is saved in Neo4j. tags.check_point_path
string /tmp/test
No The directory set to import progress information, which is used for resuming transfers. If not set, the resuming transfer is disabled."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_mysqlpostgresql_data_sources","title":"Specific parameters of MySQL/PostgreSQL data sources","text":"Parameter Type Default value Required Description tags.host
string - Yes The MySQL/PostgreSQL server address. tags.port
string - Yes The MySQL/PostgreSQL server port. tags.database
string - Yes The database name. tags.table
string - Yes The name of a table used as a data source. tags.user
string - Yes The MySQL/PostgreSQL username with read permissions. tags.password
string - Yes The account password. tags.sentence
string - Yes Statements to query data sources. For example: \"select teamid, name from team order by teamid\"
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_oracle_data_sources","title":"Specific parameters of Oracle data sources","text":"Parameter Type Default value Required Description tags.url
string - Yes The Oracle server address. tags.driver
string - Yes The Oracle driver address. tags.user
string - Yes The Oracle username with read permissions. tags.password
string - Yes The account password. tags.table
string - Yes The name of a table used as a data source. tags.sentence
string - Yes Statements to query data sources. For example: \"select playerid, name, age from player\"
."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_clickhouse_data_sources","title":"Specific parameters of ClickHouse data sources","text":"Parameter Type Default value Required Description tags.url
string - Yes The JDBC URL of ClickHouse. tags.user
string - Yes The ClickHouse username with read permissions. tags.password
string - Yes The account password. tags.numPartition
string - Yes The number of ClickHouse partitions. tags.sentence
string - Yes Statements to query data sources."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_hbase_data_sources","title":"Specific parameters of Hbase data sources","text":"Parameter Type Default value Required Description tags.host
string 127.0.0.1
Yes The Hbase server address. tags.port
string 2181
Yes The Hbase server port. tags.table
string - Yes The name of a table used as a data source. tags.columnFamily
string - Yes The column family to which a table belongs."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_pulsar_data_sources","title":"Specific parameters of Pulsar data sources","text":"Parameter Type Default value Required Description tags.service
string \"pulsar://localhost:6650\"
Yes The Pulsar server address. tags.admin
string \"http://localhost:8081\"
Yes The admin URL used to connect pulsar. tags.options.<topic|topics| topicsPattern>
string - Yes Options offered by Pulsar, which can be configured by choosing one from topic
, topics
, and topicsPattern
. tags.interval.seconds
int 10
Yes The interval for reading messages. Unit: seconds."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_of_kafka_data_sources","title":"Specific parameters of Kafka data sources","text":"Parameter Type Default value Required Description tags.service
string - Yes The Kafka server address. tags.topic
string - Yes The message type. tags.interval.seconds
int 10
Yes The interval for reading messages. Unit: seconds. tags.securityProtocol
string - No Kafka security protocol. tags.mechanism
string - No The security certification mechanism provided by SASL of Kafka. tags.kerberos
bool false
No Whether to enable Kerberos for security certification. If tags.mechanism
is kerberos
, this parameter must be set to true
. tags.kerberosServiceName
string - No Kerberos service name. If tags.kerberos
is true
, this parameter must be set."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_for_generating_sst_files","title":"Specific parameters for generating SST files","text":"Parameter Type Default value Required Description tags.path
string - Yes The path of the source file specified to generate SST files. tags.repartitionWithNebula
bool true
No Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file. Enabling this function can reduce the time required to DOWNLOAD and INGEST SST files."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#edge_configurations","title":"Edge configurations","text":"For different data sources, configurations of edges are also different. There are general parameters and some specific parameters. General parameters and specific parameters of different data sources need to be configured when users configure edges.
For the specific parameters of different data sources for edge configurations, please refer to the introduction of specific parameters of different data sources above, and pay attention to distinguishing tags and edges.
"},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#general_parameters_1","title":"General parameters","text":"Parameter Type Default value Required Descriptionedges.name
string - Yes The edge type name defined in NebulaGraph. edges.type.source
string - Yes The data source of edges. For example, csv
. edges.type.sink
string client
Yes The method specified to import data. Optional values are client
and SST
. edges.writeMode
string INSERT
No Types of batch operations on data, including batch inserts, updates, and deletes. Optional values are INSERT
, UPDATE
, DELETE
. edges.fields
list[string] - Yes The header or column name of the column corresponding to properties. If there is a header or column name, please use that name directly. If a CSV file does not have a header, use the form of [_c0, _c1, _c2]
to represent the first column, the second column, the third column, and so on. edges.nebula.fields
list[string] - Yes Edge names defined in NebulaGraph, the order of which must correspond to edges.fields
. For example, [_c2, _c3]
corresponds to [start_year, end_year]
, which means that values in the third column are the values of the start year, and values in the fourth column are the values of the end year. edges.source.field
string - Yes The column of source vertices of edges. For example, _c0
indicates a value in the first column that is used as the source vertex of an edge. edges.source.prefix
string - No Add the specified prefix to the VID. For example, if the VID is 12345
, adding the prefix tag1
will result in tag1_12345
. The underscore cannot be modified. edges.source.policy
string - No Supports only the value hash
. Performs hashing operations on VIDs of type string. edges.target.field
string - Yes The column of destination vertices of edges. For example, _c0
indicates a value in the first column that is used as the destination vertex of an edge. edges.target.prefix
string - No Add the specified prefix to the VID. For example, if the VID is 12345
, adding the prefix tag1
will result in tag1_12345
. The underscore cannot be modified. edges.target.policy
string - No Supports only the value hash
. Performs hashing operations on VIDs of type string. edges.ranking
int - No The column of rank values. If not specified, all rank values are 0
by default. edges.batch
int 256
Yes The maximum number of edges written into NebulaGraph in a single batch. edges.partition
int 32
Yes The number of partitions to be created when the data is written to NebulaGraph. If edges.partition \u2264 1
, the number of partitions to be created in NebulaGraph is the same as that in the data source. edges.filter
string - No The filtering rule. The data that matches the filter rule is imported into NebulaGraph. For information about filtering formats, see Dataset."},{"location":"import-export/nebula-exchange/parameter-reference/ex-ug-parameter/#specific_parameters_for_generating_sst_files_1","title":"Specific parameters for generating SST files","text":"Parameter Type Default value Required Description edges.path
string - Yes The path of the source file specified to generate SST files. edges.repartitionWithNebula
bool true
No Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file. Enabling this function can reduce the time required to DOWNLOAD and INGEST SST files."},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/","title":"Import data from ClickHouse","text":"This topic provides an example of how to use Exchange to import data stored on ClickHouse into NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set ClickHouse data source configuration. In this example, the copied file is called clickhouse_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n# NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n name: player\n type: {\n # Specify the data source file format to ClickHouse.\n source: clickhouse\n # Specify how to import the data of vertexes into NebulaGraph: Client or SST.\n sink: client\n }\n\n # JDBC URL of ClickHouse\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n\n user:\"user\"\n password:\"123456\"\n\n # The number of ClickHouse partitions\n numPartition:\"5\"\n\n sentence:\"select * from player\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [name,age]\n nebula.fields: [name,age]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: clickhouse\n sink: client\n }\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n user:\"user\"\n password:\"123456\"\n numPartition:\"5\"\n sentence:\"select * from team\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:teamid\n }\n batch: 256\n partition: 32\n }\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to ClickHouse.\n source: clickhouse\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # JDBC URL of ClickHouse\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n\n user:\"user\"\n password:\"123456\"\n\n # The number of ClickHouse partitions.\n numPartition:\"5\"\n\n sentence:\"select * from follow\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertexes.\n source: {\n field:src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # In target, use a column in the follow table as the source of the edge's destination vertexes.\n target: {\n field:dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: clickhouse\n sink: client\n }\n url:\"jdbc:clickhouse://192.168.*.*:8123/basketballplayer\"\n user:\"user\"\n password:\"123456\"\n numPartition:\"5\"\n sentence:\"select * from serve\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field:playerid\n }\n target: {\n field:teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import ClickHouse data into NebulaGraph. For descriptions of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <clickhouse_application.conf_path>\n
Note
JAR packages can be obtained in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/clickhouse_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of records that were imported successfully. For example, batchSuccess.follow: 300
.
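Since the Spark output is verbose, it can help to filter for these counters directly; a minimal shell sketch, assuming the log is printed to the console:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <clickhouse_application.conf_path> 2>&1 | grep batchSuccess\n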
Users can verify that the data has been imported by executing a query in a NebulaGraph client (such as NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
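SHOW STATS returns the result of the most recent STATS job, so if the statistics look empty or stale, a STATS job may need to be submitted first. A minimal sketch:
nebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\n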
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-clickhouse/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/","title":"Import data from CSV files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local CSV files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or download the compiled .jar
file directly. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#step_2_process_csv_files","title":"Step 2: Process CSV files","text":"Confirm the following information:
Process CSV files to meet Schema requirements.
Note
Exchange supports uploading CSV files with or without headers.
Obtain the CSV file storage path.
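For reference, a headerless vertex_player.csv matching the configuration below would carry the VID in the first column (_c0), followed by age (_c1) and name (_c2); a sketch built from the basketballplayer dataset:
player100,42,Tim Duncan\nplayer101,36,Tony Parker\nplayer102,33,LaMarcus Aldridge\n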
After Exchange is compiled, copy the conf file target/classes/application.conf
to set the CSV data source configuration. In this example, the copied file is called csv_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example: \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example: \"file:///tmp/xx.csv\".\n path: \"hdfs://192.168.*.*:9000/data/vertex_player.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has headers, use the actual column names.\n fields: [_c1, _c2]\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # The value of vertex must be the same as the column names in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:_c0\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: csv\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/vertex_team.csv\"\n fields: [_c1]\n nebula.fields: [name]\n vertex: {\n field:_c0\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n }\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example: \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example: \"file:///tmp/xx.csv\".\n path: \"hdfs://192.168.*.*:9000/data/edge_follow.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has headers, use the actual column names.\n fields: [_c2]\n\n # Specify the column names in the edge table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The value of vertex must be the same as the column names in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: _c0\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: _c1\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # Specify a column as the source of the rank (optional).\n\n #ranking: rank\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # The filtering rule. 
The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: csv\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/edge_serve.csv\"\n fields: [_c2,_c3]\n nebula.fields: [start_year, end_year]\n source: {\n field: _c0\n }\n target: {\n field: _c1\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-csv/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import CSV data into NebulaGraph. For descriptions of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <csv_application.conf_path> \n
Note
JAR packages can be obtained in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/csv_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of records that were imported successfully. For example, batchSuccess.follow: 300
.
When using Kerberos for security authentication, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
Configure the absolute path of the file. All YARN or Spark machines are required to have the corresponding file in the same path.
Configure the relative path of the file (for example ./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR. The files in --files
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy Spark and the Kerberos-certified Hadoop in the same cluster so that they share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that the data has been imported by executing a query in a NebulaGraph client (such as NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
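An edge index can be created and rebuilt in the same way; a sketch with an illustrative index name:
nebula> CREATE EDGE INDEX IF NOT EXISTS follow_index_1 ON follow(degree);\nnebula> REBUILD EDGE INDEX follow_index_1;\n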
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/","title":"Import data from HBase","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HBase.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in HBase. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
hbase(main):002:0> scan \"player\"\nROW COLUMN+CELL\n player100 column=cf:age, timestamp=1618881347530, value=42\n player100 column=cf:name, timestamp=1618881354604, value=Tim Duncan\n player101 column=cf:age, timestamp=1618881369124, value=36\n player101 column=cf:name, timestamp=1618881379102, value=Tony Parker\n player102 column=cf:age, timestamp=1618881386987, value=33\n player102 column=cf:name, timestamp=1618881393370, value=LaMarcus Aldridge\n player103 column=cf:age, timestamp=1618881402002, value=32\n player103 column=cf:name, timestamp=1618881407882, value=Rudy Gay\n ...\n\nhbase(main):003:0> scan \"team\"\nROW COLUMN+CELL\n team200 column=cf:name, timestamp=1618881445563, value=Warriors\n team201 column=cf:name, timestamp=1618881453636, value=Nuggets\n ...\n\nhbase(main):004:0> scan \"follow\"\nROW COLUMN+CELL\n player100 column=cf:degree, timestamp=1618881804853, value=95\n player100 column=cf:dst_player, timestamp=1618881791522, value=player101\n player101 column=cf:degree, timestamp=1618881824685, value=90\n player101 column=cf:dst_player, timestamp=1618881816042, value=player102\n ...\n\nhbase(main):005:0> scan \"serve\"\nROW COLUMN+CELL\n player100 column=cf:end_year, timestamp=1618881899333, value=2016\n player100 column=cf:start_year, timestamp=1618881890117, value=1997\n player100 column=cf:teamid, timestamp=1618881875739, value=team204\n ...\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or download the compiled .jar
file directly. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set the HBase data source configuration. In this example, the copied file is called hbase_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set information about Tag player.\n # If you want to set RowKey as the data source, enter rowkey and the actual column name of the column family.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to HBase.\n source: hbase\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n host:192.168.*.*\n port:2181\n table:\"player\"\n columnFamily:\"cf\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # For example, if rowkey is the source of the VID, enter rowkey.\n vertex:{\n field:rowkey\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # Number of pieces of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set Tag Team information.\n {\n name: team\n type: {\n source: hbase\n sink: client\n }\n host:192.168.*.*\n port:2181\n table:\"team\"\n columnFamily:\"cf\"\n fields: [name]\n nebula.fields: [name]\n vertex:{\n field:rowkey\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to HBase.\n source: hbase\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n host:192.168.*.*\n port:2181\n table:\"follow\"\n columnFamily:\"cf\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source:{\n field:rowkey\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n\n target:{\n field:dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: hbase\n sink: client\n }\n host:192.168.*.*\n port:2181\n table:\"serve\"\n columnFamily:\"cf\"\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source:{\n field:rowkey\n }\n\n target:{\n field:teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import HBase data into NebulaGraph. For descriptions of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <hbase_application.conf_path>\n
Note
JAR packages can be obtained in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/hbase_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of records that were imported successfully. For example, batchSuccess.follow: 300
.
Users can verify that the data has been imported by executing a query in a NebulaGraph client (such as NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
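Another quick check is to fetch a single vertex by its VID; a sketch using player100 from the sample data:
nebula> FETCH PROP ON player \"player100\" YIELD properties(vertex);\n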
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hbase/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/","title":"Import data from Hive","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in Hive.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in Hive. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
scala> spark.sql(\"describe basketball.player\").show\n+--------+---------+-------+\n|col_name|data_type|comment|\n+--------+---------+-------+\n|playerid| string| null|\n| age| bigint| null|\n| name| string| null|\n+--------+---------+-------+\n\nscala> spark.sql(\"describe basketball.team\").show\n+----------+---------+-------+\n| col_name|data_type|comment|\n+----------+---------+-------+\n| teamid| string| null|\n| name| string| null|\n+----------+---------+-------+\n\nscala> spark.sql(\"describe basketball.follow\").show\n+----------+---------+-------+\n| col_name|data_type|comment|\n+----------+---------+-------+\n|src_player| string| null|\n|dst_player| string| null|\n| degree| bigint| null|\n+----------+---------+-------+\n\nscala> spark.sql(\"describe basketball.serve\").show\n+----------+---------+-------+\n| col_name|data_type|comment|\n+----------+---------+-------+\n| playerid| string| null|\n| teamid| string| null|\n|start_year| bigint| null|\n| end_year| bigint| null|\n+----------+---------+-------+\n
Note
The Hive data type bigint
corresponds to the NebulaGraph int
.
This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or download the compiled .jar
file directly. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_2_use_spark_sql_to_confirm_hive_sql_statements","title":"Step 2: Use Spark SQL to confirm Hive SQL statements","text":"After the Spark-shell environment is started, run the following statements to ensure that Spark can read data in Hive.
scala> sql(\"select playerid, age, name from basketball.player\").show\nscala> sql(\"select teamid, name from basketball.team\").show\nscala> sql(\"select src_player, dst_player, degree from basketball.follow\").show\nscala> sql(\"select playerid, teamid, start_year, end_year from basketball.serve\").show\n
The following is the result read from the table basketball.player
.
+---------+----+-----------------+\n| playerid| age| name|\n+---------+----+-----------------+\n|player100| 42| Tim Duncan|\n|player101| 36| Tony Parker|\n|player102| 33|LaMarcus Aldridge|\n|player103| 32| Rudy Gay|\n|player104| 32| Marco Belinelli|\n+---------+----+-----------------+\n...\n
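If these statements cannot find the Hive tables, the spark-shell session may have been started without Hive catalog support. One possible way to enable it when launching spark-shell (assuming Spark's built-in Hive integration is available) is:
${SPARK_HOME}/bin/spark-shell --conf spark.sql.catalogImplementation=hive\n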
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_3_modify_configuration_file","title":"Step 3: Modify configuration file","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set the Hive data source configuration. In this example, the copied file is called hive_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # If Spark and Hive are deployed in different clusters, you need to configure the parameters for connecting to Hive. Otherwise, skip these configurations.\n #hive: {\n # waredir: \"hdfs://NAMENODE_IP:9000/apps/svr/hive-xxx/warehouse/\"\n # connectionURL: \"jdbc:mysql://your_ip:3306/hive_spark?characterEncoding=UTF-8\"\n # connectionDriverName: \"com.mysql.jdbc.Driver\"\n # connectionUserName: \"user\"\n # connectionPassword: \"password\"\n #}\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Hive.\n source: hive\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Set the SQL statement to read the data of player table in basketball database.\n exec: \"select playerid, age, name from basketball.player\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n vertex:{\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: hive\n sink: client\n }\n exec: \"select teamid, name from basketball.team\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to Hive.\n source: hive\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Set the SQL statement to read the data of follow table in the basketball database.\n exec: \"select src_player, dst_player, degree from basketball.follow\"\n\n # Specify the column names in the follow table in Fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's starting vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: hive\n sink: client\n }\n exec: \"select playerid, teamid, start_year, end_year from basketball.serve\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import Hive data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <hive_application.conf_path> -h\n
Note
JAR packages can be obtained in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/hive_application.conf -h\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of records that were imported successfully. For example, batchSuccess.follow: 300
.
When using Kerberos for security authentication, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
Configure the absolute path of the file. All YARN or Spark machines are required to have the corresponding file in the same path.
Configure the relative path of the file (for example ./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR. The files in --files
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy Spark and the Kerberos-certified Hadoop in the same cluster so that they share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that the data has been imported by executing a query in a NebulaGraph client (such as NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-hive/#step_6_optional_rebuild_indexes_in_nebulagraph","title":"Step 6: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/","title":"Import data from general JDBC","text":"JDBC data refers to the data of various databases accessed through the JDBC interface. This topic provides an example of how to use Exchange to export MySQL data and import to NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in MySQL. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
mysql> desc player;\n+----------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+----------+-------------+------+-----+---------+-------+\n| playerid | int | YES | | NULL | |\n| age | int | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+----------+-------------+------+-----+---------+-------+\n\nmysql> desc team;\n+--------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+--------+-------------+------+-----+---------+-------+\n| teamid | int | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+--------+-------------+------+-----+---------+-------+\n\nmysql> desc follow;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| src_player | int | YES | | NULL | |\n| dst_player | int | YES | | NULL | |\n| degree | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n\nmysql> desc serve;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| playerid | int | YES | | NULL | |\n| teamid | int | YES | | NULL | |\n| start_year | int | YES | | NULL | |\n| end_year | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
Exchange has been compiled, or download the compiled .jar
file directly. nebula-exchange_spark_2.2 supports only single-table queries, not multi-table queries.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#steps","title":"Steps","text":""},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_1_create_the_schema_in_nebulagraph","title":"Step 1: Create the Schema in NebulaGraph","text":"Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set the JDBC data source configuration. In this example, the copied file is called jdbc_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to JDBC.\n source: jdbc\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # URL of the JDBC data source. The example is MySql database.\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n\n # JDBC driver \n driver:\"com.mysql.cj.jdbc.Driver\"\n\n # Database user name and password\n user:\"root\"\n password:\"12345\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter, and can additionally configure sentence.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.player\"\n\n # Use query statement to read data.\n # nebula-exchange_spark_2.2 can configure this parameter. Multi-table queries are not supported. Only the table name needs to be written after from. The form `db.table` is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence:\"select playerid, age, name from player, team order by playerid\"\n\n # (optional)Multiple connections read parameters. See https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html\n partitionColumn:playerid # optional. Must be a numeric, date, or timestamp column from the table in question.\n lowerBound:1 # optional\n upperBound:5 # optional\n numPartitions:5 # optional\n\n\n fetchSize:2 # The JDBC fetch size, which determines how many rows to fetch per round trip.\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. 
For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: jdbc\n sink: client\n }\n\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n driver:\"com.mysql.cj.jdbc.Driver\"\n user:root\n password:\"12345\"\n table:team\n sentence:\"select teamid, name from team order by teamid\"\n partitionColumn:teamid \n lowerBound:1 \n upperBound:5 \n numPartitions:5 \n fetchSize:2 \n\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to JDBC.\n source: jdbc\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n driver:\"com.mysql.cj.jdbc.Driver\"\n user:root\n password:\"12345\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter, and can additionally configure sentence.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.follow\"\n\n # Use query statement to read data.\n # nebula-exchange_spark_2.2 can configure this parameter. Multi-table queries are not supported. Only the table name needs to be written after from. The form `db.table` is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence:\"select src_player,dst_player,degree from follow order by src_player\"\n\n partitionColumn:src_player \n lowerBound:1 \n upperBound:5 \n numPartitions:5 \n fetchSize:2 \n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. 
For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: jdbc\n sink: client\n }\n\n url:\"jdbc:mysql://127.0.0.1:3306/basketball?useUnicode=true&characterEncoding=utf-8\"\n driver:\"com.mysql.cj.jdbc.Driver\"\n user:root\n password:\"12345\"\n table:serve\n sentence:\"select playerid,teamid,start_year,end_year from serve order by playerid\"\n partitionColumn:playerid \n lowerBound:1 \n upperBound:5 \n numPartitions:5 \n fetchSize:2\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n}\n
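The partitionColumn, lowerBound, upperBound, and numPartitions settings above control how Spark parallelizes the JDBC read: each Spark task scans one slice of the partitionColumn range. Conceptually, the generated queries look like the following sketch; the exact WHERE clauses are computed by Spark, and the first slice also collects NULL values:
-- one slice per Spark partition\nSELECT playerid, age, name FROM basketball.player WHERE playerid < 2 OR playerid IS NULL;\nSELECT playerid, age, name FROM basketball.player WHERE playerid >= 2 AND playerid < 3;\n-- ... and so on, up to the last slice\n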
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import general JDBC data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <jdbc_application.conf_path>\n
Note
JAR packages can be obtained in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/jdbc_application.conf\n
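If Spark cannot find com.mysql.cj.jdbc.Driver on its classpath, the MySQL connector JAR can be attached to the job with the spark-submit --jars option; a sketch with an illustrative driver path:
${SPARK_HOME}/bin/spark-submit --master \"local\" --jars /path/to/mysql-connector-j.jar --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/jdbc_application.conf\n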
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
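If the Spark driver output was saved to a file, a quick way to pull these counters out is a search like the following (the log file name spark-submit.log is only an assumption about how the output was redirected):
grep \"batchSuccess\" spark-submit.log\n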
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
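Note that SHOW STATS reads pre-computed statistics, so the statistics job has to be run first. A typical sequence is:
nebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\n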
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-jdbc/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/","title":"Import data from JSON files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local JSON files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example. Some sample data are as follows:
player
{\"id\":\"player100\",\"age\":42,\"name\":\"Tim Duncan\"}\n{\"id\":\"player101\",\"age\":36,\"name\":\"Tony Parker\"}\n{\"id\":\"player102\",\"age\":33,\"name\":\"LaMarcus Aldridge\"}\n{\"id\":\"player103\",\"age\":32,\"name\":\"Rudy Gay\"}\n...\n
team
{\"id\":\"team200\",\"name\":\"Warriors\"}\n{\"id\":\"team201\",\"name\":\"Nuggets\"}\n...\n
follow
{\"src\":\"player100\",\"dst\":\"player101\",\"degree\":95}\n{\"src\":\"player101\",\"dst\":\"player102\",\"degree\":90}\n...\n
serve
{\"src\":\"player100\",\"dst\":\"team204\",\"start_year\":\"1997\",\"end_year\":\"2016\"}\n{\"src\":\"player101\",\"dst\":\"team204\",\"start_year\":\"1999\",\"end_year\":\"2018\"}\n...\n
This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
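Before moving on, the Schema can be checked with the following statements:
nebula> SHOW TAGS;\nnebula> SHOW EDGES;\nnebula> DESCRIBE TAG player;\n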
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/#step_2_process_json_files","title":"Step 2: Process JSON files","text":"Confirm the following information:
Process JSON files to meet Schema requirements.
Obtain the JSON file storage path.
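For instance, if the files are to be read from HDFS, staging and verifying them might look like this (the /data path is illustrative):
hdfs dfs -mkdir -p /data\nhdfs dfs -put vertex_player.json /data/\nhdfs dfs -ls /data\n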
After Exchange is compiled, copy the conf file target/classes/application.conf
to set JSON data source configuration. In this example, the copied file is called json_application.conf
. For details on each configuration item, see Parameters in the configuration file.
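For instance, the copy can be made as follows, assuming the compiled classes directory is the working location:
cp target/classes/application.conf target/classes/json_application.conf\n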
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\" \n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to JSON.\n source: json\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the JSON file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.json\".\n path: \"hdfs://192.168.*.*:9000/data/vertex_player.json\"\n\n # Specify the key name in the JSON file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # The value of vertex must be the same as that in the JSON file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n{\n name: team\n type: {\n source: json\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/vertex_team.json\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n batch: 256\n partition: 32\n }\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to JSON.\n source: json\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the JSON file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.json\".\n path: \"hdfs://192.168.*.*:9000/data/edge_follow.json\"\n\n # Specify the key name in the JSON file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n\n # Specify the column names in the edge table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The value of vertex must be the same as that in the JSON file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: json\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/edge_serve.json\"\n fields: [start_year,end_year]\n nebula.fields: [start_year, end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n batch: 256\n partition: 32\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-json/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import JSON data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <json_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/json_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
When using Kerberos for security certification, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR. The files in --files
must be stored on the machine where the spark-submit
command is executed.
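For reference, the same options written with absolute local paths might look like this (the /etc/krb5.conf path is illustrative); in that case the file must exist at the same path on every node, and --files can be omitted:
--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf\"\n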
Without commands
Deploy Spark and the Kerberos-certified Hadoop in the same cluster so that they share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
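A sketch of that change, assuming Hadoop is installed under /usr/local/hadoop:
echo 'export HADOOP_HOME=/usr/local/hadoop' >> ${SPARK_HOME}/conf/spark-env.sh\n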
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/","title":"Import data from Kafka","text":"This topic provides a simple guide to importing Data stored on Kafka into NebulaGraph using Exchange.
Compatibility
Use Exchange 3.0.0, 3.3.0, or 3.5.0 to import Kafka data. Version 3.4.0 added caching of imported data and does not support importing streaming data.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. The following JAR files have been downloaded and placed in the directory SPARK_HOME/jars
of Spark:
tags.type.sink
and edges.type.sink
is client
. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"Note
If some data is stored in Kafka's value field, you need to modify the source code: get the value from Kafka, parse it with the from_json function, and return it as a DataFrame.
After Exchange is compiled, copy the conf file target/classes/application.conf
to set Kafka data source configuration. In this example, the copied file is called kafka_application.conf
. For details on each configuration item, see Parameters in the configuration file.
Note
When importing Kafka data, a configuration file can only handle one tag or edge type. If there are multiple tag or edge types, you need to create multiple configuration files.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n\n # The corresponding Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Kafka.\n source: kafka\n # Specify how to import the data into NebulaGraph. Only client is supported.\n sink: client\n }\n # Kafka server address.\n service: \"127.0.0.1:9092\"\n # Message category.\n topic: \"topic_name1\"\n\n # If Kafka uses Kerberos for security certification, the following parameters need to be set. If Kafka uses SASL or SASL_PLAINTEXT for security certification, you do not need to set kerberos or kerberosServiceName.\n #securityProtocol: SASL_PLAINTEXT\n #mechanism: GASSAPI\n #kerberos: true\n #kerberosServiceName: kafka\n\n # Kafka data has a fixed domain name: key, value, topic, partition, offset, timestamp, timestampType.\n # If multiple fields need to be specified after Spark reads as DataFrame, separate them with commas.\n # Specify the field name in fields. For example, use key for name in NebulaGraph and value for age in Nebula, as shown in the following.\n fields: [key,value]\n nebula.fields: [name,age]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n # The key is the same as the value above, indicating that key is used as both VID and property name.\n vertex:{\n field:key\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 10\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 10\n # The interval for message reading. 
Unit: second.\n interval.seconds: 10\n # The consumer offsets. The default value is latest. Optional value are latest and earliest.\n startingOffsets: latest\n # Flow control, with a rate limit on the maximum offset processed per trigger interval, may not be configured.\n # maxOffsetsPerTrigger:10000\n }\n ]\n\n # Processing edges\n #edges: [\n # # Set the information about the Edge Type follow.\n # {\n # # The corresponding Edge Type name in NebulaGraph.\n # name: follow\n\n # type: {\n # # Specify the data source file format to Kafka.\n # source: kafka\n\n # # Specify how to import the Edge type data into NebulaGraph.\n # # Specify how to import the data into NebulaGraph. Only client is supported.\n # sink: client\n # }\n\n # # Kafka server address.\n # service: \"127.0.0.1:9092\"\n # # Message category.\n # topic: \"topic_name3\"\n\n # # If Kafka uses Kerberos for security certification, the following parameters need to be set. If Kafka uses SASL or SASL_PLAINTEXT for security certification, you do not need to set kerberos or kerberosServiceName.\n # #securityProtocol: SASL_PLAINTEXT\n # #mechanism: GASSAPI\n # #kerberos: true\n # #kerberosServiceName: kafka\n\n # # Kafka data has a fixed domain name: key, value, topic, partition, offset, timestamp, timestampType.\n # # If multiple fields need to be specified after Spark reads as DataFrame, separate them with commas.\n # # Specify the field name in fields. For example, use key for degree in Nebula, as shown in the following.\n # fields: [key]\n # nebula.fields: [degree]\n\n # # In source, use a column in the topic as the source of the edge's source vertex.\n # # In target, use a column in the topic as the source of the edge's destination vertex.\n # source:{\n # field:timestamp\n # # udf:{\n # # separator:\"_\"\n # # oldColNames:[field-0,field-1,field-2]\n # # newColName:new-field\n # # }\n # # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # # prefix:\"tag1\"\n # # Performs hashing operations on VIDs of type string.\n # # policy:hash\n # }\n\n\n # target:{\n # field:offset\n # # udf:{\n # # separator:\"_\"\n # # oldColNames:[field-0,field-1,field-2]\n # # newColName:new-field\n # # }\n # # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # # prefix:\"tag1\"\n # # Performs hashing operations on VIDs of type string.\n # # policy:hash\n # }\n\n # # (Optional) Specify a column as the source of the rank.\n # #ranking: rank\n\n # # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n # #writeMode: INSERT\n\n # # The number of data written to NebulaGraph in a single batch.\n # batch: 10\n\n # # The number of partitions to be created when the data is written to NebulaGraph.\n # partition: 10\n\n # # The interval for message reading. Unit: second.\n # interval.seconds: 10\n # # The consumer offsets. The default value is latest. Optional value are latest and earliest.\n # startingOffsets: latest\n # # Flow control, with a rate limit on the maximum offset processed per trigger interval, may not be configured.\n # # maxOffsetsPerTrigger:10000\n # }\n #]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import Kafka data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <kafka_application.conf_path>\n
Note
Example:
No security certification
${SPARK_HOME}/bin/spark-submit --master \"local\" \\\n--class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar \\\n-c /root/nebula-exchange/target/classes/kafka_application.conf\n
Enable Kerberos security certification
${SPARK_HOME}/bin/spark-submit --master \"local\" \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf -Djava.security.krb5.conf=/path/krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf -Djava.security.krb5.conf=/path/krb5.conf\" \\\n--files /local/path/kafka_client_jaas.conf,/local/path/kafka.keytab,/local/path/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar \\\n-c /root/nebula-exchange/target/classes/kafka_application.conf\n
Enable SASL/SASL_PLAINTEXT security certification
${SPARK_HOME}/bin/spark-submit --master \"local\" \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/path/kafka_client_jaas.conf\" \\\n--files /local/path/kafka_client_jaas.conf \\\n--class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar \\\n-c /root/nebula-exchange/target/classes/kafka_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-kafka/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/","title":"Import data from MaxCompute","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in MaxCompute.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set MaxCompute data source configuration. In this example, the copied file is called maxcompute_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n name: player\n type: {\n # Specify the data source file format to MaxCompute.\n source: maxcompute\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Table name of MaxCompute.\n table:player\n\n # Project name of MaxCompute.\n project:project\n\n # OdpsUrl and tunnelUrl for the MaxCompute service.\n # The address is https://help.aliyun.com/document_detail/34951.html.\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n\n # AccessKeyId and accessKeySecret of the MaxCompute service.\n accessKeyId:xxx\n accessKeySecret:xxx\n\n # Partition description of the MaxCompute table. This configuration is optional.\n partitionSpec:\"dt='partition1'\"\n\n # Ensure that the table name in the SQL statement is the same as the value of the table above. This configuration is optional.\n sentence:\"select id, name, age, playerid from player where id < 10\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields:[name, age]\n nebula.fields:[name, age]\n\n # Specify a column of data in the table as the source of vertex VID in the NebulaGraph.\n vertex:{\n field: playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: maxcompute\n sink: client\n }\n table:team\n project:project\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n accessKeyId:xxx\n accessKeySecret:xxx\n partitionSpec:\"dt='partition1'\"\n sentence:\"select id, name, teamid from team where id < 10\"\n fields:[name]\n nebula.fields:[name]\n vertex:{\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type:{\n # Specify the data source file format to MaxCompute.\n source:maxcompute\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink:client\n }\n\n # Table name of MaxCompute.\n table:follow\n\n # Project name of MaxCompute.\n project:project\n\n # OdpsUrl and tunnelUrl for MaxCompute service.\n # The address is https://help.aliyun.com/document_detail/34951.html.\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n\n # AccessKeyId and accessKeySecret of the MaxCompute service.\n accessKeyId:xxx\n accessKeySecret:xxx\n\n # Partition description of the MaxCompute table. This configuration is optional.\n partitionSpec:\"dt='partition1'\"\n\n # Ensure that the table name in the SQL statement is the same as the value of the table above. This configuration is optional.\n sentence:\"select * from follow\"\n\n # Specify the column names in the follow table in Fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields:[degree]\n nebula.fields:[degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n source:{\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n target:{\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition:10\n\n # The number of data written to NebulaGraph in a single batch.\n batch:10\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type:{\n source:maxcompute\n sink:client\n }\n table:serve\n project:project\n odpsUrl:\"http://service.cn-hangzhou.maxcompute.aliyun.com/api\"\n tunnelUrl:\"http://dt.cn-hangzhou.maxcompute.aliyun.com\"\n accessKeyId:xxx\n accessKeySecret:xxx\n partitionSpec:\"dt='partition1'\"\n sentence:\"select * from serve\"\n fields:[start_year,end_year]\n nebula.fields:[start_year,end_year]\n source:{\n field: playerid\n }\n target:{\n field: teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n partition:10\n batch:10\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-maxcompute/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import MaxCompute data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <maxcompute_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/maxcompute_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/","title":"Import data from MySQL/PostgreSQL","text":"This topic provides an example of how to use Exchange to export MySQL data and import to NebulaGraph. It also applies to exporting data from PostgreSQL into NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in MySQL. All vertexes and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
mysql> desc player;\n+----------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+----------+-------------+------+-----+---------+-------+\n| playerid | varchar(30) | YES | | NULL | |\n| age | int | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+----------+-------------+------+-----+---------+-------+\n\nmysql> desc team;\n+--------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+--------+-------------+------+-----+---------+-------+\n| teamid | varchar(30) | YES | | NULL | |\n| name | varchar(30) | YES | | NULL | |\n+--------+-------------+------+-----+---------+-------+\n\nmysql> desc follow;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| src_player | varchar(30) | YES | | NULL | |\n| dst_player | varchar(30) | YES | | NULL | |\n| degree | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n\nmysql> desc serve;\n+------------+-------------+------+-----+---------+-------+\n| Field | Type | Null | Key | Default | Extra |\n+------------+-------------+------+-----+---------+-------+\n| playerid | varchar(30) | YES | | NULL | |\n| teamid | varchar(30) | YES | | NULL | |\n| start_year | int | YES | | NULL | |\n| end_year | int | YES | | NULL | |\n+------------+-------------+------+-----+---------+-------+\n
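To spot-check a few rows before configuring the import, queries such as the following can be run (output omitted here):
mysql> SELECT * FROM player LIMIT 3;\nmysql> SELECT * FROM follow LIMIT 3;\n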
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.SPARK_HOME/jars
of Spark.nebula-exchange_spark_2.2 supports only single table queries, not multi-table queries.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#steps","title":"Steps","text":""},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_1_create_the_schema_in_nebulagraph","title":"Step 1: Create the Schema in NebulaGraph","text":"Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set MySQL data source configuration. In this case, the copied file is called mysql_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to MySQL.\n source: mysql\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n user:\"test\"\n password:\"123456\"\n database:\"basketball\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.player\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from people, player, team\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: mysql\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n database:\"basketball\"\n table:\"team\"\n user:\"test\"\n password:\"123456\"\n sentence:\"select teamid, name from team order by teamid;\"\n\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to MySQL.\n source: mysql\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n user:\"test\"\n password:\"123456\"\n database:\"basketball\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.follow\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from follow, serve\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: mysql\n sink: client\n }\n\n host:192.168.*.*\n port:3306\n database:\"basketball\"\n table:\"serve\"\n user:\"test\"\n password:\"123456\"\n sentence:\"select playerid,teamid,start_year,end_year from serve order by playerid;\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import MySQL data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <mysql_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/mysql_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-mysql/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/","title":"Import data from Neo4j","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in Neo4j.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#implementation_method","title":"Implementation method","text":"Exchange uses Neo4j Driver 4.0.1 to read Neo4j data. Before batch export, you need to write Cypher statements that are automatically executed based on labels and relationship types and the number of Spark partitions in the configuration file to improve data export performance.
When Exchange reads Neo4j data, it needs to do the following:
The Reader in Exchange replaces the statement following the Cypher RETURN
statement in the exec
part of the configuration file with COUNT(*)
, and executes this statement to get the total amount of data, then calculates the starting offset and size of each partition based on the number of Spark partitions.
(Optional) If the user has configured the check_point_path
directory, Reader reads the files in the directory. In the transferring state, Reader calculates the offset and size that each Spark partition should have.
In each Spark partition, the Reader in Exchange adds different SKIP
and LIMIT
statements to the Cypher statement and calls the Neo4j Driver for parallel execution to distribute data to different Spark partitions.
The Reader finally processes the returned data into a DataFrame.
At this point, Exchange has finished exporting the Neo4j data. The data is then written in parallel to the NebulaGraph database.
The whole process is illustrated below.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Hardware specifications:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element | Name | Property
Tag | player | name string, age int
Tag | team | name string
Edge Type | follow | degree int
Edge Type | serve | start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#step_2_configuring_source_data","title":"Step 2: Configuring source data","text":"To speed up the export of Neo4j data, create indexes for the corresponding properties in the Neo4j database. For more information, refer to the Neo4j manual.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#step_3_modify_configuration_files","title":"Step 3: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Neo4j data source configuration. In this example, the copied file is called neo4j_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n space: basketballplayer\n\n connection: {\n timeout: 3000\n retry: 3\n }\n\n execution: {\n retry: 3\n }\n\n error: {\n max: 32\n output: /tmp/errors\n }\n\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n\n\n # Set the information about the Tag player\n {\n name: player\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n # bolt 3 does not support multiple databases, do not configure database names. 4 and above can configure database names.\n # database:neo4j\n exec: \"match (n:player) return n.id as id, n.age as age, n.name as name\"\n fields: [age,name]\n nebula.fields: [age,name]\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n # Set the information about the Tag Team\n {\n name: team\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n database:neo4j\n exec: \"match (n:team) return n.id as id,n.name as name\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow\n {\n name: follow\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n # bolt 3 does not support multiple databases, do not configure database names. 4 and above can configure database names.\n # database:neo4j\n exec: \"match (a:player)-[r:follow]->(b:player) return a.id as src, b.id as dst, r.degree as degree order by id(r)\"\n fields: [degree]\n nebula.fields: [degree]\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. 
The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n # Set the information about the Edge Type serve\n {\n name: serve\n type: {\n source: neo4j\n sink: client\n }\n server: \"bolt://192.168.*.*:7687\"\n user: neo4j\n password:neo4j\n database:neo4j\n exec: \"match (a:player)-[r:serve]->(b:team) return a.id as src, b.id as dst, r.start_year as start_year, r.end_year as end_year order by id(r)\"\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n #ranking: rank\n partition: 10\n batch: 1000\n check_point_path: /tmp/test\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#exec_configuration","title":"Exec configuration","text":"When configuring either the tags.exec
or edges.exec
parameters, you need to fill in the Cypher query. To prevent data loss during import, it is strongly recommended to include an ORDER BY clause in the Cypher query. To improve import efficiency, select indexed properties for ordering. If there is no index, observe the default order and select appropriate properties for ordering. If no pattern can be found in the default order, order the data by the ID of the vertex or relationship and set the partition configuration to a small value to reduce the ordering pressure on Neo4j.
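For example, for the player tag in the configuration above, an ordered exec statement could look like this (a sketch that assumes the id property is indexed in Neo4j):
match (n:player) return n.id as id, n.age as age, n.name as name order by n.id\n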
Note
Using the ORDER BY
clause lengthens the data import time.
Exchange needs to execute different SKIP
and LIMIT
Cypher statements on different Spark partitions, so SKIP
and LIMIT
clauses cannot be included in the Cypher statements corresponding to tags.exec
and edges.exec
.
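For illustration only: Exchange appends the pagination itself, so a single Spark partition effectively executes something like the following (the skip and limit values here are hypothetical):
match (n:player) return n.id as id, n.age as age, n.name as name order by n.id skip 2000 limit 1000\n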
NebulaGraph uses the ID as the unique primary key when creating vertices and edges, and overwrites the existing data for that primary key if it already exists. So, if a Neo4j property value is used as NebulaGraph's ID and that value is duplicated in Neo4j, duplicate IDs will be generated. Only one of the corresponding records will be stored in NebulaGraph, and the others will be overwritten. Because the import process writes data to NebulaGraph concurrently, the final saved record is not guaranteed to be the latest one in Neo4j.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-neo4j/#check_point_path_configuration","title":"check_point_path configuration","text":"If breakpoint transfers are enabled, to avoid data loss, the state of the database should not change between the breakpoint and the transfer. For example, data cannot be added or deleted, and the partition
quantity configuration should not be changed.
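For example, the options involved are the ones already shown in the tag configuration above; check_point_path stores the progress files used for resuming, and partition must keep the same value across the resumed import:
partition: 10\nbatch: 1000\ncheck_point_path: /tmp/test\n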
Run the following command to import Neo4j data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <neo4j_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/neo4j_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
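To compare the number of imported vertices against the source, you can also pipe the result into a count (a sketch using native nGQL):
LOOKUP ON player YIELD id(vertex) | YIELD COUNT(*);\n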
Users can also run the SHOW STATS
command to view statistics.
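For example, the following statements submit a statistics job and then display the result once the job finishes (SHOW STATS reflects the most recent STATS job):
nebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\n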
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
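For example, a minimal sketch of creating and rebuilding a native index on the player tag (the index name and the 20-byte prefix length are illustrative):
nebula> CREATE TAG INDEX IF NOT EXISTS player_index_on_name ON player(name(20));\nnebula> REBUILD TAG INDEX player_index_on_name;\n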
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/","title":"Import data from Oracle","text":"This topic provides an example of how to use Exchange to export Oracle data and import to NebulaGraph.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
In this example, the data set has been stored in Oracle. All vertices and edges are stored in the player
, team
, follow
, and serve
tables. The following are some of the data for each table.
oracle> desc player;\n+-----------+-------+---------------+ \n| Column | Null | Type |\n+-----------+-------+---------------+ \n| PLAYERID | - | VARCHAR2(30) |\n| NAME | - | VARCHAR2(30) |\n| AGE | - | NUMBER |\n+-----------+-------+---------------+ \n\noracle> desc team;\n+-----------+-------+---------------+ \n| Column | Null | Type |\n+-----------+-------+---------------+ \n| TEAMID | - | VARCHAR2(30) |\n| NAME | - | VARCHAR2(30) |\n+-----------+-------+---------------+ \n\noracle> desc follow;\n+-------------+-------+---------------+ \n| Column | Null | Type |\n+-------------+-------+---------------+ \n| SRC_PLAYER | - | VARCHAR2(30) |\n| DST_PLAYER | - | VARCHAR2(30) |\n| DEGREE | - | NUMBER |\n+-------------+-------+---------------+ \n\noracle> desc serve;\n+------------+-------+---------------+ \n| Column | Null | Type |\n+------------+-------+---------------+ \n| PLAYERID | - | VARCHAR2(30) |\n| TEAMID | - | VARCHAR2(30) |\n| START_YEAR | - | NUMBER |\n| END_YEAR | - | NUMBER |\n+------------+-------+---------------+ \n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly. nebula-exchange_spark_2.2 supports only single table queries, not multi-table queries.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#steps","title":"Steps","text":""},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_1_create_the_schema_in_nebulagraph","title":"Step 1: Create the Schema in NebulaGraph","text":"Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Oracle data source configuration. In this case, the copied file is called oracle_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # The Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Oracle.\n source: oracle\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.player\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from people, player, team\"\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex: {\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: oracle\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n table: \"basketball.team\"\n sentence: \"select teamid, name from team\"\n\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to Oracle.\n source: oracle\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n\n # Scanning a single table to read data.\n # nebula-exchange_spark_2.2 must configure this parameter. Sentence is not supported.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as sentence.\n table:\"basketball.follow\"\n\n # Use query statement to read data.\n # This parameter is not supported by nebula-exchange_spark_2.2.\n # nebula-exchange_spark_2.4 and nebula-exchange_spark_3.0 can configure this parameter, but not at the same time as table. Multi-table queries are supported.\n # sentence: \"select * from follow, serve\"\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source: {\n field: src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target: {\n field: dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: oracle\n sink: client\n }\n\n url:\"jdbc:oracle:thin:@host:1521:basketball\"\n driver: \"oracle.jdbc.driver.OracleDriver\"\n user: \"root\"\n password: \"123456\"\n table: \"basketball.serve\"\n sentence: \"select playerid, teamid, start_year, end_year from serve\"\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source: {\n field: playerid\n }\n target: {\n field: teamid\n }\n batch: 256\n partition: 32\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import Oracle data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <oracle_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/oracle_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS command to view statistics.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-oracle/#step_5_optional_rebuild_indexes_in_nebulagraph","title":"Step 5: (optional) Rebuild indexes in NebulaGraph","text":"With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/","title":"Import data from ORC files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local ORC files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#step_2_process_orc_files","title":"Step 2: Process ORC files","text":"Confirm the following information:
Process ORC files to meet Schema requirements.
Obtain the ORC file storage path.
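If you are unsure whether an ORC file matches the Schema, you can inspect it in the Spark shell first (a quick check; the path is the example path used in this topic):
scala> spark.read.orc(\"hdfs://192.168.*.*:9000/data/vertex_player.orc\").printSchema\nscala> spark.read.orc(\"hdfs://192.168.*.*:9000/data/vertex_player.orc\").show(3)\n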
After Exchange is compiled, copy the conf file target/classes/application.conf
to set ORC data source configuration. In this example, the copied file is called orc_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n name: player\n type: {\n # Specify the data source file format to ORC.\n source: orc\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the ORC file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.orc\".\n path: \"hdfs://192.168.*.*:9000/data/vertex_player.orc\"\n\n # Specify the key name in the ORC file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [age,name]\n\n # Specify the property names defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n # The value of vertex must be consistent with the field in the ORC file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag team.\n {\n name: team\n type: {\n source: orc\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/vertex_team.orc\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n batch: 256\n partition: 32\n }\n\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to ORC.\n source: orc\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the ORC file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.orc\".\n path: \"hdfs://192.168.*.*:9000/data/edge_follow.orc\"\n\n # Specify the key name in the ORC file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [degree]\n\n # Specify the property names defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The value of vertex must be consistent with the field in the ORC file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge type serve.\n {\n name: serve\n type: {\n source: orc\n sink: client\n }\n path: \"hdfs://192.168.*.*:9000/data/edge_serve.orc\"\n fields: [start_year,end_year]\n nebula.fields: [start_year, end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n batch: 256\n partition: 32\n }\n\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-orc/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import ORC data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <orc_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/orc_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
When using Kerberos for security authentication, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
Configure the absolute path of the file.
Configure the relative path of the file (for example, ./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR.
The files in --files
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy Spark and the Kerberos-certified Hadoop in the same cluster so that they share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
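For example, a minimal sketch (the Hadoop installation path is illustrative):
# Add to ${SPARK_HOME}/conf/spark-env.sh\nexport HADOOP_HOME=/usr/local/hadoop\n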
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/","title":"Import data from Parquet files","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in HDFS or local Parquet files.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#data_set","title":"Data set","text":"This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space.\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer.\nnebula> USE basketballplayer;\n\n## Create the Tag player.\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team.\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow.\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve.\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#step_2_process_parquet_files","title":"Step 2: Process Parquet files","text":"Confirm the following information:
Process Parquet files to meet Schema requirements.
Obtain the Parquet file storage path.
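You can verify the key names in a Parquet file against the fields configuration from the Spark shell (a quick check; the path is the example path used in this topic):
scala> spark.read.parquet(\"hdfs://192.168.*.13:9000/data/vertex_player.parquet\").printSchema\n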
After Exchange is compiled, copy the conf file target/classes/application.conf
to set Parquet data source configuration. In this example, the copied file is called parquet_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n executor: {\n memory:1G\n }\n\n cores: {\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n\n # Processing vertexes\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Parquet.\n source: parquet\n\n # Specifies how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the Parquet file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.parquet\".\n path: \"hdfs://192.168.*.13:9000/data/vertex_player.parquet\"\n\n # Specify the key name in the Parquet file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [age,name]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n # The value of vertex must be consistent with the field in the Parquet file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:id\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. 
This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Tag team.\n {\n name: team\n type: {\n source: parquet\n sink: client\n }\n path: \"hdfs://192.168.11.13:9000/data/vertex_team.parquet\"\n fields: [name]\n nebula.fields: [name]\n vertex: {\n field:id\n }\n batch: 256\n partition: 32\n }\n\n\n # If more vertexes need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # Specify the Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format to Parquet.\n source: parquet\n\n # Specifies how to import the data into NebulaGraph: Client or SST.\n sink: client\n }\n\n # Specify the path to the Parquet file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://ip:port/xx/xx\".\n # If the file is stored locally, use double quotation marks to enclose the file path, starting with file://. For example, \"file:///tmp/xx.parquet\".\n path: \"hdfs://192.168.11.13:9000/data/edge_follow.parquet\"\n\n # Specify the key name in the Parquet file in fields, and its corresponding value will serve as the data source for the properties specified in the NebulaGraph.\n # If multiple values need to be specified, separate them with commas.\n fields: [degree]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertexes.\n # The values of vertex must be consistent with the fields in the Parquet file.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: src\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: dst\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n }\n\n # Set the information about the Edge type serve.\n {\n name: serve\n type: {\n source: parquet\n sink: client\n }\n path: \"hdfs://192.168.11.13:9000/data/edge_serve.parquet\"\n fields: [start_year,end_year]\n nebula.fields: [start_year, end_year]\n source: {\n field: src\n }\n target: {\n field: dst\n }\n batch: 256\n partition: 32\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-parquet/#step_4_import_data_into_nebulagraph","title":"Step 4: Import data into NebulaGraph","text":"Run the following command to import Parquet data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <parquet_application.conf_path> \n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/parquet_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
When using Kerberos for security authentication, you can access the HDFS data in one of the following ways.
Configure the Kerberos configuration file in a command
Configure --conf
and --files
in the command, for example:
${SPARK_HOME}/bin/spark-submit --master xxx --num-executors 2 --executor-cores 2 --executor-memory 1g \\\n--conf \"spark.driver.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--conf \"spark.executor.extraJavaOptions=-Djava.security.krb5.conf=./krb5.conf\" \\\n--files /local/path/to/xxx.keytab,/local/path/to/krb5.conf \\\n--class com.vesoft.nebula.exchange.Exchange \\\nexchange.jar -c xx.conf\n
The file path in --conf
can be configured in two ways as follows:
Configure the absolute path of the file.
Configure the relative path of the file (for example, ./krb5.conf
). The resource files uploaded via --files
are located in the working directory of the Java virtual machine or JAR.
The files in --files
must be stored on the machine where the spark-submit
command is executed.
Without commands
Deploy Spark and the Kerberos-certified Hadoop in the same cluster so that they share HDFS and YARN, and then add the configuration export HADOOP_HOME=<hadoop_home_path>
to spark-env.sh
in Spark.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/","title":"Import data from Pulsar","text":"This topic provides an example of how to use Exchange to import NebulaGraph data stored in Pulsar.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
.jar
file directly.tags.type.sink
and edges.type.sink
is client
.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/#step_2_modify_configuration_files","title":"Step 2: Modify configuration files","text":"After Exchange is compiled, copy the conf file target/classes/application.conf
to set Pulsar data source configuration. In this example, the copied file is called pulsar_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n cores: {\n max: 16\n }\n }\n\n\n # NebulaGraph configuration\n nebula: {\n address:{\n # Specify the IP addresses and ports for Graph and all Meta services.\n # If there are multiple addresses, the format is \"ip1:port\",\"ip2:port\",\"ip3:port\".\n # Addresses are separated by commas.\n graph:[\"127.0.0.1:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"127.0.0.1:9559\"]\n }\n\n # The account entered must have write permission for the NebulaGraph space.\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n # Fill in the name of the graph space you want to write data to in the NebulaGraph.\n space: basketballplayer\n connection: {\n timeout: 3000\n retry: 3\n }\n execution: {\n retry: 3\n }\n error: {\n max: 32\n output: /tmp/errors\n }\n rate: {\n limit: 1024\n timeout: 1000\n }\n }\n # Processing vertices\n tags: [\n # Set the information about the Tag player.\n {\n # The corresponding Tag name in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to Pulsar.\n source: pulsar\n # Specify how to import the data into NebulaGraph. Only client is supported.\n sink: client\n }\n # The address of the Pulsar server.\n service: \"pulsar://127.0.0.1:6650\"\n # admin.url of pulsar.\n admin: \"http://127.0.0.1:8081\"\n # The Pulsar option can be configured from topic, topics or topicsPattern.\n options: {\n topics: \"topic1,topic2\"\n }\n\n # Specify the column names in the player table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [age,name]\n nebula.fields: [age,name]\n\n # Specify a column of data in the table as the source of VIDs in the NebulaGraph.\n vertex:{\n field:playerid\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # Whether or not to delete the related incoming and outgoing edges of the vertices when performing a batch delete operation. This parameter takes effect when `writeMode` is `DELETE`.\n #deleteEdge: false\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 10\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 10\n # The interval for message reading. 
Unit: second.\n interval.seconds: 10\n }\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: pulsar\n sink: client\n }\n service: \"pulsar://127.0.0.1:6650\"\n admin: \"http://127.0.0.1:8081\"\n options: {\n topics: \"topic1,topic2\"\n }\n fields: [name]\n nebula.fields: [name]\n vertex:{\n field:teamid\n }\n batch: 10\n partition: 10\n interval.seconds: 10\n }\n\n ]\n\n # Processing edges\n edges: [\n # Set the information about Edge Type follow\n {\n # The corresponding Edge Type name in NebulaGraph.\n name: follow\n\n type: {\n # Specify the data source file format to Pulsar.\n source: pulsar\n\n # Specify how to import the Edge type data into NebulaGraph.\n # Specify how to import the data into NebulaGraph. Only client is supported.\n sink: client\n }\n\n # The address of the Pulsar server.\n service: \"pulsar://127.0.0.1:6650\"\n # admin.url of pulsar.\n admin: \"http://127.0.0.1:8081\"\n # The Pulsar option can be configured from topic, topics or topicsPattern.\n options: {\n topics: \"topic1,topic2\"\n }\n\n # Specify the column names in the follow table in fields, and their corresponding values are specified as properties in the NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n # If multiple column names need to be specified, separate them by commas.\n fields: [degree]\n nebula.fields: [degree]\n\n # In source, use a column in the follow table as the source of the edge's source vertex.\n # In target, use a column in the follow table as the source of the edge's destination vertex.\n source:{\n field:src_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n target:{\n field:dst_player\n # udf:{\n # separator:\"_\"\n # oldColNames:[field-0,field-1,field-2]\n # newColName:new-field\n # }\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. defaults to INSERT.\n #writeMode: INSERT\n\n # The number of data written to NebulaGraph in a single batch.\n batch: 10\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 10\n\n # The interval for message reading. Unit: second.\n interval.seconds: 10\n }\n\n # Set the information about the Edge Type serve\n {\n name: serve\n type: {\n source: Pulsar\n sink: client\n }\n service: \"pulsar://127.0.0.1:6650\"\n admin: \"http://127.0.0.1:8081\"\n options: {\n topics: \"topic1,topic2\"\n }\n\n fields: [start_year,end_year]\n nebula.fields: [start_year,end_year]\n source:{\n field:playerid\n }\n\n target:{\n field:teamid\n }\n\n # (Optional) Specify a column as the source of the rank.\n #ranking: rank\n\n batch: 10\n partition: 10\n interval.seconds: 10\n }\n ]\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-pulsar/#step_3_import_data_into_nebulagraph","title":"Step 3: Import data into NebulaGraph","text":"Run the following command to import Pulsar data into NebulaGraph. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <pulsar_application.conf_path>\n
Note
JAR packages are available in two ways: compile them yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/pulsar_application.conf\n
You can search for batchSuccess.<tag_name/edge_name>
in the command output to check the number of successes. For example, batchSuccess.follow: 300
.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/","title":"Import data from SST files","text":"This topic provides an example of how to generate the data from the data source into an SST (Sorted String Table) file and save it on HDFS, and then import it into NebulaGraph. The sample data source is a CSV file.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#precautions","title":"Precautions","text":"Exchange supports two data import modes:
The following describes the scenarios, implementation methods, prerequisites, and steps for generating an SST file and importing data.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#scenarios","title":"Scenarios","text":"Suitable for online services, because the generation almost does not affect services (just reads the Schema), and the import speed is fast.
Caution
Although the import speed is fast, write operations in the corresponding space are blocked during the import period (about 10 seconds). Therefore, you are advised to import data in off-peak hours.
The underlying code in NebulaGraph uses RocksDB as the key-value storage engine. RocksDB is a disk-based storage engine that provides a series of APIs for creating and importing SST files, which helps import massive amounts of data quickly.
The SST file is an internal file containing an arbitrarily long set of ordered key-value pairs, used to store large amounts of key-value data efficiently. The entire process of generating SST files is mainly done by Exchange's Reader, sstProcessor, and sstWriter. The whole data processing workflow is as follows:
Reader reads data from the data source.
sstProcessor generates the SST file based on NebulaGraph's Schema information and uploads it to HDFS. For details about the format of the SST file, see Data Storage Format.
sstWriter opens a file and inserts data. When generating SST files, keys must be written in sequence.
After the SST file is generated, RocksDB imports the SST file into NebulaGraph using the IngestExternalFile()
method. For example:
IngestExternalFileOptions ifo;\n// Import two SST files\nStatus s = db_->IngestExternalFile({\"/home/usr/file1.sst\", \"/home/usr/file2.sst\"}, ifo);\nif (!s.ok()) {\n printf(\"Error while adding file %s and %s, Error %s\\n\",\n file_path1.c_str(), file_path2.c_str(), s.ToString().c_str());\n return 1;\n}\n
When the IngestExternalFile()
method is called, RocksDB copies the file to the data directory by default and blocks RocksDB write operations. If the key range in the SST file overlaps the Memtable key range, the Memtable is flushed to the hard disk first. After placing the SST file in an optimal location in the LSM tree, RocksDB assigns a global sequence number to the file and re-enables write operations.
This topic takes the basketballplayer dataset as an example.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#environment","title":"Environment","text":"This example is done on MacOS. Here is the environment configuration information:
Before importing data, you need to confirm the following information:
NebulaGraph has been installed and deployed with the following information:
--ws_storage_http_port
in the Meta service configuration file is the same as --ws_http_port
in the Storage service configuration file. For example, 19779
.--ws_meta_http_port
in the Graph service configuration file is the same as --ws_http_port
in the Meta service configuration file. For example, 19559
..jar
file directly.JAVA_HOME
has been configured.The Hadoop service has been installed and started.
Note
-- move_Files =true
to the Storage Service configuration file.Analyze the data to create a Schema in NebulaGraph by following these steps:
Identify the Schema elements. The Schema elements in the NebulaGraph are shown in the following table.
Element Name Property Tagplayer
name string, age int
Tag team
name string
Edge Type follow
degree int
Edge Type serve
start_year int, end_year int
Create a graph space basketballplayer in the NebulaGraph and create a Schema as shown below.
## Create a graph space\nnebula> CREATE SPACE basketballplayer \\\n (partition_num = 10, \\\n replica_factor = 1, \\\n vid_type = FIXED_STRING(30));\n\n## Use the graph space basketballplayer\nnebula> USE basketballplayer;\n\n## Create the Tag player\nnebula> CREATE TAG player(name string, age int);\n\n## Create the Tag team\nnebula> CREATE TAG team(name string);\n\n## Create the Edge type follow\nnebula> CREATE EDGE follow(degree int);\n\n## Create the Edge type serve\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
For more information, see Quick start workflow.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#step_2_process_csv_files","title":"Step 2: Process CSV files","text":"Confirm the following information:
Process CSV files to meet Schema requirements.
Note
Exchange supports uploading CSV files with or without headers.
Obtain the CSV file storage path.
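For reference, a headerless vertex_player.csv that matches the configuration below would contain rows like the following (illustrative values; column _c0 is the VID, _c1 the age, and _c2 the name):
player100,42,Tim Duncan\nplayer101,36,Tony Parker\n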
After Exchange is compiled, copy the conf file target/classes/application.conf
to set SST data source configuration. In this example, the copied file is called sst_application.conf
. For details on each configuration item, see Parameters in the configuration file.
{\n # Spark configuration\n spark: {\n app: {\n name: NebulaGraph Exchange 3.7.0\n }\n\n master:local\n\n driver: {\n cores: 1\n maxResultSize: 1G\n }\n\n executor: {\n memory:1G\n }\n\n cores:{\n max: 16\n }\n }\n\n # NebulaGraph configuration\n nebula: {\n address:{\n graph:[\"192.8.168.XXX:9669\"]\n # the address of any of the meta services.\n # if your NebulaGraph server is in virtual network like k8s, please config the leader address of meta.\n meta:[\"192.8.168.XXX:9559\"]\n }\n user: root\n pswd: nebula\n # Whether to use a password encrypted with RSA.\n # enableRSA: true\n # The key used to encrypt the password using RSA.\n # privateKey: \"\"\n\n space: basketballplayer\n\n # SST file configuration\n path:{\n # The local directory that temporarily stores generated SST files\n local:\"/tmp\"\n\n # The path for storing the SST file in the HDFS\n remote:\"/sst\"\n\n # The NameNode address of HDFS, for example, \"hdfs://<ip/hostname>:<port>\"\n hdfs.namenode: \"hdfs://*.*.*.*:9000\"\n }\n\n # The connection parameters of clients\n connection: {\n # The timeout duration of socket connection and execution. Unit: milliseconds.\n timeout: 30000\n }\n\n error: {\n # The maximum number of failures that will exit the application.\n max: 32\n # Failed import jobs are logged in the output path.\n output: /tmp/errors\n }\n\n # Use Google's RateLimiter to limit requests to NebulaGraph.\n rate: {\n # Steady throughput of RateLimiter.\n limit: 1024\n\n # Get the allowed timeout duration from RateLimiter. Unit: milliseconds.\n timeout: 1000\n }\n }\n\n\n # Processing vertices\n tags: [\n # Set the information about the Tag player.\n {\n # Specify the Tag name defined in NebulaGraph.\n name: player\n type: {\n # Specify the data source file format to CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: sst\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://<ip/hostname>:port/xx/xx.csv\".\n path: \"hdfs://*.*.*.*:9000/dataset/vertex_player.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has a header, use the actual column name.\n fields: [_c1, _c2]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [age, name]\n\n # Specify a column of data in the table as the source of VIDs in NebulaGraph.\n # The value of vertex must be consistent with the column name in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n vertex: {\n field:_c0\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
The default value is INSERT.\n #writeMode: INSERT\n\n # The number of rows written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n\n # Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file.\n repartitionWithNebula: false\n }\n\n # Set the information about the Tag Team.\n {\n name: team\n type: {\n source: csv\n sink: sst\n }\n path: \"hdfs://*.*.*.*:9000/dataset/vertex_team.csv\"\n fields: [_c1]\n nebula.fields: [name]\n vertex: {\n field:_c0\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n repartitionWithNebula: false\n }\n # If more vertices need to be added, refer to the previous configuration to add them.\n ]\n # Processing edges\n edges: [\n # Set the information about the Edge Type follow.\n {\n # The Edge Type name defined in NebulaGraph.\n name: follow\n type: {\n # Specify the data source file format as CSV.\n source: csv\n\n # Specify how to import the data into NebulaGraph: Client or SST.\n sink: sst\n }\n\n # Specify the path to the CSV file.\n # If the file is stored in HDFS, use double quotation marks to enclose the file path, starting with hdfs://. For example, \"hdfs://<ip/hostname>:port/xx/xx.csv\".\n path: \"hdfs://*.*.*.*:9000/dataset/edge_follow.csv\"\n\n # If the CSV file does not have a header, use [_c0, _c1, _c2, ..., _cn] to represent its header and indicate the columns as the source of the property values.\n # If the CSV file has a header, use the actual column name.\n fields: [_c2]\n\n # Specify the property name defined in NebulaGraph.\n # The sequence of fields and nebula.fields must correspond to each other.\n nebula.fields: [degree]\n\n # Specify a column as the source for the source and destination vertices.\n # The values of source and target must be consistent with the column names in the above fields or csv.fields.\n # Currently, NebulaGraph master supports only strings or integers of VID.\n source: {\n field: _c0\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n target: {\n field: _c1\n # Add the specified prefix to the VID. For example, if the VID is `12345`, adding the prefix `tag1` will result in `tag1_12345`. The underscore cannot be modified.\n # prefix:\"tag1\"\n # Performs hashing operations on VIDs of type string.\n # policy:hash\n }\n\n # The delimiter specified. The default value is comma.\n separator: \",\"\n\n # (Optional) Specify a column as the source of the rank.\n\n #ranking: rank\n\n # If the CSV file has a header, set the header to true.\n # If the CSV file does not have a header, set the header to false. The default value is false.\n header: false\n\n # The filtering rule. The data that matches the filter rule is imported into NebulaGraph.\n # filter: \"name='Tom'\"\n\n # Batch operation types, including INSERT, UPDATE, and DELETE. 
The default value is INSERT.\n #writeMode: INSERT\n\n # The number of rows written to NebulaGraph in a single batch.\n batch: 256\n\n # The number of partitions to be created when the data is written to NebulaGraph.\n partition: 32\n\n # Whether to repartition data based on the number of partitions of graph spaces in NebulaGraph when generating the SST file.\n repartitionWithNebula: false\n }\n\n # Set the information about the Edge Type serve.\n {\n name: serve\n type: {\n source: csv\n sink: sst\n }\n path: \"hdfs://*.*.*.*:9000/dataset/edge_serve.csv\"\n fields: [_c2,_c3]\n nebula.fields: [start_year, end_year]\n source: {\n field: _c0\n }\n target: {\n field: _c1\n }\n separator: \",\"\n header: false\n batch: 256\n partition: 32\n repartitionWithNebula: false\n }\n\n ]\n # If more edges need to be added, refer to the previous configuration to add them.\n}\n
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#step_4_generate_the_sst_file","title":"Step 4: Generate the SST file","text":"Run the following command to generate the SST file from the CSV source file. For a description of the parameters, see Options for import.
${SPARK_HOME}/bin/spark-submit --master \"local\" --conf spark.sql.shuffle.partitions=<shuffle_concurrency> --class com.vesoft.nebula.exchange.Exchange <nebula-exchange.jar_path> -c <sst_application.conf_path> \n
Note
Generating SST files involves a Spark shuffle operation. Be sure to add the spark.sql.shuffle.partitions
configuration when you submit the command.
Note
You can obtain the JAR package in two ways: compile it yourself, or download the compiled .jar
file directly.
For example:
${SPARK_HOME}/bin/spark-submit --master \"local\" --conf spark.sql.shuffle.partitions=200 --class com.vesoft.nebula.exchange.Exchange /root/nebula-exchange/nebula-exchange/target/nebula-exchange_spark_2.4-3.7.0.jar -c /root/nebula-exchange/nebula-exchange/target/classes/sst_application.conf\n
After the task is complete, you can view the generated SST file in the /sst
directory (specified by the nebula.path.remote
parameter) on HDFS.
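For example, if an HDFS client is available on the machine where you ran the command, a quick way to check the output is the following (a sketch; the /sst path matches the nebula.path.remote setting shown above):
hdfs dfs -ls /sst\n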
Note
If you modify the Schema, such as rebuilding the graph space, modifying the Tag, or modifying the Edge type, you need to regenerate the SST file because the SST file verifies the space ID, Tag ID, and Edge ID.
"},{"location":"import-export/nebula-exchange/use-exchange/ex-ug-import-from-sst/#step_5_import_the_sst_file","title":"Step 5: Import the SST file","text":"Note
Confirm the following information before importing:
The environment variables HADOOP_HOME and JAVA_HOME have been configured. The --ws_storage_http_port in the Meta service configuration file (add it manually if it does not exist) is the same as the --ws_http_port in the Storage service configuration file. For example, both are 19779. The --ws_meta_http_port in the Graph service configuration file (add it manually if it does not exist) is the same as the --ws_http_port in the Meta service configuration file. For example, both are 19559.
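For example, a quick way to compare these ports is to search the configuration files directly (a sketch; the /usr/local/nebula/etc path assumes the default installation directory):
grep http_port /usr/local/nebula/etc/nebula-metad.conf /usr/local/nebula/etc/nebula-storaged.conf /usr/local/nebula/etc/nebula-graphd.conf\n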
Connect to the NebulaGraph database using the client tool and import the SST file as follows:
Run the following command to select the graph space you created earlier.
nebula> USE basketballplayer;\n
Run the following command to download the SST file:
nebula> SUBMIT JOB DOWNLOAD HDFS \"hdfs://<hadoop_address>:<hadoop_port>/<sst_file_path>\";\n
For example:
nebula> SUBMIT JOB DOWNLOAD HDFS \"hdfs://*.*.*.*:9000/sst\";\n
Run the following command to import the SST file:
nebula> SUBMIT JOB INGEST;\n
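Both the DOWNLOAD and INGEST statements submit asynchronous jobs. You can check their progress before moving on; the job IDs in the output will differ from environment to environment:
nebula> SHOW JOBS;\n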
Note
To download the SST file again, delete the download folder in the corresponding space ID directory under data/storage/nebula in the NebulaGraph installation path, and then download the SST file again. If the space has multiple copies, the download folder needs to be deleted on all machines where the copies are saved. To import the SST file again, run SUBMIT JOB INGEST; again.
Users can verify that data has been imported by executing a query in the NebulaGraph client (for example, NebulaGraph Studio). For example:
LOOKUP ON player YIELD id(vertex);\n
Users can also run the SHOW STATS
command to view statistics.
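For example (SUBMIT JOB STATS must be run first so that SHOW STATS returns up-to-date statistics):
nebula> SUBMIT JOB STATS;\nnebula> SHOW STATS;\n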
With the data imported, users can recreate and rebuild indexes in NebulaGraph. For details, see Index overview.
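For example, a minimal sketch that creates an index on the player tag and rebuilds it (the index name player_index is illustrative):
nebula> CREATE TAG INDEX IF NOT EXISTS player_index ON player();\nnebula> REBUILD TAG INDEX player_index;\n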
"},{"location":"k8s-operator/1.introduction-to-nebula-operator/","title":"What is NebulaGraph Operator","text":""},{"location":"k8s-operator/1.introduction-to-nebula-operator/#concept","title":"Concept","text":"NebulaGraph Operator is a tool to automate the deployment, operation, and maintenance of NebulaGraph clusters on Kubernetes. Building upon the excellent scalability mechanism of Kubernetes, NebulaGraph introduced its operation and maintenance knowledge into the Kubernetes system, which makes NebulaGraph a real cloud-native graph database.
"},{"location":"k8s-operator/1.introduction-to-nebula-operator/#how_it_works","title":"How it works","text":"For resource types that do not exist within Kubernetes, you can register them by adding custom API objects. The common way is to use the CustomResourceDefinition.
NebulaGraph Operator abstracts the deployment management of NebulaGraph clusters as a CRD. By combining multiple built-in API objects including StatefulSet, Service, and ConfigMap, the routine management and maintenance of a NebulaGraph cluster are coded as a control loop in the Kubernetes system. When a CR instance is submitted, NebulaGraph Operator drives database clusters to the final state according to the control process.
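For example, after the Operator is installed, you can confirm that the custom resource types it registers are known to the API server (a quick check, assuming kubectl access to the cluster):
kubectl api-resources | grep nebula-graph.io\n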
"},{"location":"k8s-operator/1.introduction-to-nebula-operator/#features","title":"Features","text":"The following features are already available in NebulaGraph Operator:
NebulaGraph Operator does not support the v1.x version of NebulaGraph. NebulaGraph Operator version and the corresponding NebulaGraph version are as follows:
NebulaGraph NebulaGraph Operator 3.5.x ~ 3.6.0 1.5.0 ~ 1.7.x 3.0.0 ~ 3.4.1 1.3.0, 1.4.0 ~ 1.4.2 3.0.0 ~ 3.3.x 1.0.0, 1.1.0, 1.2.0 2.5.x ~ 2.6.x 0.9.0 2.5.x 0.8.0Legacy version compatibility
Release
"},{"location":"k8s-operator/5.FAQ/","title":"FAQ","text":""},{"location":"k8s-operator/5.FAQ/#does_nebulagraph_operator_support_the_v1x_version_of_nebulagraph","title":"Does NebulaGraph Operator support the v1.x version of NebulaGraph?","text":"No, because the v1.x version of NebulaGraph does not support DNS, and NebulaGraph Operator requires the use of DNS.
"},{"location":"k8s-operator/5.FAQ/#is_cluster_stability_guaranteed_if_using_local_storage","title":"Is cluster stability guaranteed if using local storage?","text":"There is no guarantee. Using local storage means that the Pod is bound to a specific node, and NebulaGraph Operator does not currently support failover in the event of a failure of the bound node.
"},{"location":"k8s-operator/5.FAQ/#how_to_ensure_the_stability_of_a_cluster_when_scaling_the_cluster","title":"How to ensure the stability of a cluster when scaling the cluster?","text":"It is suggested to back up data in advance so that you can roll back data in case of failure.
"},{"location":"k8s-operator/5.FAQ/#is_the_replica_in_the_operator_docs_the_same_as_the_replica_in_the_nebulagraph_core_docs","title":"Is the replica in the Operator docs the same as the replica in the NebulaGraph core docs?","text":"They are different concepts. A replica in the Operator docs indicates a pod replica in K8s, while a replica in the core docs is a replica of a NebulaGraph storage partition.
"},{"location":"k8s-operator/5.FAQ/#how_to_view_the_logs_of_each_service_in_the_nebulagraph_cluster","title":"How to view the logs of each service in the NebulaGraph cluster?","text":"To obtain the logs of each cluster service, you need to access the container and view the log files that are stored inside.
Steps to view the logs of each service in the NebulaGraph cluster:
# To view the name of the pod where the container you want to access is located. \n# Replace <cluster-name> with the name of the cluster.\nkubectl get pods -l app.kubernetes.io/cluster=<cluster-name>\n\n# To access the container within the pod, such as the nebula-graphd-0 container.\nkubectl exec -it nebula-graphd-0 -- /bin/bash\n\n# To go to /usr/local/nebula/logs directory to view the logs.\ncd /usr/local/nebula/logs\n
"},{"location":"k8s-operator/5.FAQ/#how_to_resolve_the_host_not_foundnebula-metadstoragedgraphd-0nebulametadstoragedgraphd-headlessdefaultsvcclusterlocal_error","title":"How to resolve the host not found:nebula-<metad|storaged|graphd>-0.nebula.<metad|storaged|graphd>-headless.default.svc.cluster.local
error?","text":"This error is generally caused by a DNS resolution failure, and you need to check whether the cluster domain has been modified. If the cluster domain has been modified, you need to modify the kubernetesClusterDomain
field in the NebulaGraph Operator configuration file accordingly. The steps for modifying the Operator configuration file are as follows:
View the Operator configuration file.
[abby@master ~]$ helm show values nebula-operator/nebula-operator \nimage:\n nebulaOperator:\n image: vesoft/nebula-operator:v1.8.0\n imagePullPolicy: Always\n kubeRBACProxy:\n image: bitnami/kube-rbac-proxy:0.14.2\n imagePullPolicy: Always\n kubeScheduler:\n image: registry.k8s.io/kube-scheduler:v1.24.11\n imagePullPolicy: Always\n\nimagePullSecrets: []\nkubernetesClusterDomain: \"\" # The cluster domain name, and the default is cluster.local.\n
Modify the value of the kubernetesClusterDomain
field to the updated cluster domain name.
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=<nebula-operator-system> --version=1.8.0 --set kubernetesClusterDomain=<cluster-domain>\n
<nebula-operator-system> is the namespace where Operator is located and <cluster-domain> is the updated domain name."},{"location":"k8s-operator/2.get-started/2.1.install-operator/","title":"Install NebulaGraph Operator","text":"You can deploy NebulaGraph Operator with Helm.
"},{"location":"k8s-operator/2.get-started/2.1.install-operator/#background","title":"Background","text":"NebulaGraph Operator automates the management of NebulaGraph clusters, and eliminates the need for you to install, scale, upgrade, and uninstall NebulaGraph clusters, which lightens the burden on managing different application versions.
"},{"location":"k8s-operator/2.get-started/2.1.install-operator/#prerequisites","title":"Prerequisites","text":"Before installing NebulaGraph Operator, you need to install the following software and ensure the correct version of the software :
Software Requirement Kubernetes >= 1.18 Helm >= 3.2.0 CoreDNS >= 1.6.0Note
Add the NebulaGraph Operator Helm repository.
helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts\n
Update information of available charts locally from repositories.
helm repo update\n
For more information about helm repo
, see Helm Repo.
Create a namespace for NebulaGraph Operator.
kubectl create namespace <namespace_name>\n
For example, run the following command to create a namespace named nebula-operator-system
.
kubectl create namespace nebula-operator-system\n
All the resources of NebulaGraph Operator are deployed in this namespace.
Install NebulaGraph Operator.
helm install nebula-operator nebula-operator/nebula-operator --namespace=<namespace_name> --version=${chart_version}\n
For example, the command to install NebulaGraph Operator of version 1.8.0 is as follows.
helm install nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0\n
1.8.0
is the version of the nebula-operator chart. When not specifying --version
, the latest version of the nebula-operator chart is used by default.
Run helm search repo -l nebula-operator
to see chart versions.
You can customize the configuration items of the NebulaGraph Operator chart before running the installation command. For more information, see Customize installation defaults.
View the information about the default-created CRD.
kubectl get crd\n
Output:
NAME CREATED AT\nnebulaautoscalers.autoscaling.nebula-graph.io 2023-11-01T04:16:51Z\nnebulaclusters.apps.nebula-graph.io 2023-10-12T07:55:32Z\nnebularestores.apps.nebula-graph.io 2023-02-04T23:01:00Z\n
Create a NebulaGraph cluster
"},{"location":"k8s-operator/2.get-started/2.3.create-cluster/","title":"Create a NebulaGraph cluster","text":"This topic introduces how to create a NebulaGraph cluster with the following two methods:
Legacy version compatibility
The 1.x version NebulaGraph Operator is not compatible with NebulaGraph of version below v3.x.
Add the NebulaGraph Operator Helm repository.
helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts\n
Update information of available charts locally from chart repositories.
helm repo update\n
Set environment variables to your desired values.
export NEBULA_CLUSTER_NAME=nebula # The desired NebulaGraph cluster name.\nexport NEBULA_CLUSTER_NAMESPACE=nebula # The desired namespace where your NebulaGraph cluster locates.\nexport STORAGE_CLASS_NAME=fast-disks # The name of the StorageClass that has been created.\n
Create a namespace for your NebulaGraph cluster (If you have created one, skip this step).
kubectl create namespace \"${NEBULA_CLUSTER_NAMESPACE}\"\n
Apply the variables to the Helm chart to create a NebulaGraph cluster.
helm install \"${NEBULA_CLUSTER_NAME}\" nebula-operator/nebula-cluster \\\n --set nameOverride=\"${NEBULA_CLUSTER_NAME}\" \\\n --set nebula.storageClassName=\"${STORAGE_CLASS_NAME}\" \\\n # Specify the version of the NebulaGraph cluster. \n --set nebula.version=vmaster \\ \n # Specify the version of the nebula-cluster chart. If not specified, the latest version of the chart is installed by default.\n # Run 'helm search repo nebula-operator/nebula-cluster' to view the available versions of the chart. \n --version=1.8.0 \\\n --namespace=\"${NEBULA_CLUSTER_NAMESPACE}\" \\\n
Legacy version compatibility
The 1.x version NebulaGraph Operator is not compatible with NebulaGraph of version below v3.x.
The following example shows how to create a NebulaGraph cluster by creating a cluster named nebula
.
Create a namespace, for example, nebula
. If not specified, the default
namespace is used.
kubectl create namespace nebula\n
Define the cluster configuration file nebulacluster.yaml
.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n topologySpreadConstraints:\n - topologyKey: \"kubernetes.io/hostname\"\n whenUnsatisfiable: \"ScheduleAnyway\"\n graphd:\n # Container image for the Graph service.\n image: vesoft/nebula-graphd\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n # Storage class name for storing Graph service logs.\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n imagePullPolicy: Always\n metad:\n # Container image for the Meta service.\n image: vesoft/nebula-metad\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n reference:\n name: statefulsets.apps\n version: v1\n schedulerName: default-scheduler\n storaged:\n # Container image for the Storage service.\n image: vesoft/nebula-storaged\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n
For more information about the other parameters, see Install NebulaGraph clusters.
Create a NebulaGraph cluster.
kubectl create -f nebulacluster.yaml\n
Output:
nebulacluster.apps.nebula-graph.io/nebula created\n
Check the status of the NebulaGraph cluster.
kubectl get nc nebula\n
Output:
NAME READY GRAPHD-DESIRED GRAPHD-READY METAD-DESIRED METAD-READY STORAGED-DESIRED STORAGED-READY AGE\nnebula True 1 1 1 1 1 1 86s\n
Connect to a cluster
"},{"location":"k8s-operator/2.get-started/2.4.connect-to-cluster/","title":"Connect to a NebulaGraph cluster","text":"After creating a NebulaGraph cluster with NebulaGraph Operator on Kubernetes, you can connect to NebulaGraph databases from within the cluster and outside the cluster.
"},{"location":"k8s-operator/2.get-started/2.4.connect-to-cluster/#prerequisites","title":"Prerequisites","text":"A NebulaGraph cluster is created on Kubernetes. For more information, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/2.get-started/2.4.connect-to-cluster/#connect_to_nebulagraph_databases_from_within_a_nebulagraph_cluster","title":"Connect to NebulaGraph databases from within a NebulaGraph cluster","text":"You can create a ClusterIP
type Service to provide an access point to the NebulaGraph database for other Pods within the cluster. By using the Service's IP and the Graph service's port number (9669), you can connect to the NebulaGraph database. For more information, see ClusterIP.
Create a file named graphd-clusterip-service.yaml
. The file contents are as follows:
apiVersion: v1\nkind: Service\nmetadata:\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: nebula-graphd-svc\n namespace: default\nspec:\n ports:\n - name: thrift\n port: 9669\n protocol: TCP\n targetPort: 9669\n - name: http\n port: 19669\n protocol: TCP\n targetPort: 19669\n selector:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n type: ClusterIP # Set the type to ClusterIP.\n
9669
by default. 19669
is the HTTP port of the Graph service in a NebulaGraph cluster.targetPort
is the port mapped to the database Pods, which can be customized.Create a ClusterIP Service.
kubectl create -f graphd-clusterip-service.yaml \n
Check the IP of the Service:
$ kubectl get service -l app.kubernetes.io/cluster=<nebula> # <nebula> is the name of your NebulaGraph cluster.\nNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-svc ClusterIP 10.98.213.34 <none> 9669/TCP,19669/TCP,19670/TCP 23h\n...\n
Run the following command to connect to the NebulaGraph database using the IP of the <cluster-name>-graphd-svc
Service above:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_ip> -port <service_port> -u <username> -p <password>\n
For example:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 10.98.213.34 -port 9669 -u root -p vesoft\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
: The custom Pod name.-addr
: The IP of the ClusterIP
Service, used to connect to Graphd services.-port
: The port to connect to Graphd services, the default port of which is 9669
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.A successful connection to the database is indicated if the following is returned:
If you don't see a command prompt, try pressing enter.\n\n(root@nebula) [(none)]>\n
You can also connect to NebulaGraph databases with Fully Qualified Domain Name (FQDN). The domain format is <cluster-name>-graphd.<cluster-namespace>.svc.<CLUSTER_DOMAIN>
. The default value of CLUSTER_DOMAIN
is cluster.local
.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_name>-graphd-svc.default.svc.cluster.local -port <service_port> -u <username> -p <password>\n
service_port
is the port to connect to Graphd services, the default port of which is 9669
.
Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr nebula-graphd-svc.default.svc.cluster.local -port 9669 -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
You can also create a ClusterIP
type Service to provide an access point to the NebulaGraph database for other Pods within the cluster. By using the Service's IP and the Graph service's port number (9669), you can connect to the NebulaGraph database. For more information, see ClusterIP.
Create a file named graphd-clusterip-service.yaml
. The file contents are as follows:
apiVersion: v1\nkind: Service\nmetadata:\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: nebula-graphd-svc\n namespace: default\nspec:\n externalTrafficPolicy: Local\n ports:\n - name: thrift\n port: 9669\n protocol: TCP\n targetPort: 9669\n - name: http\n port: 19669\n protocol: TCP\n targetPort: 19669\n selector:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n type: ClusterIP # Set the type to ClusterIP.\n
9669
by default. 19669
is the HTTP port of the Graph service in a NebulaGraph cluster.targetPort
is the port mapped to the database Pods, which can be customized.Create a ClusterIP Service.
kubectl create -f graphd-clusterip-service.yaml \n
Check the IP of the Service:
$ kubectl get service -l app.kubernetes.io/cluster=<nebula> # <nebula> is the name of your NebulaGraph cluster.\nNAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-svc ClusterIP 10.98.213.34 <none> 9669/TCP,19669/TCP,19670/TCP 23h\n...\n
Run the following command to connect to the NebulaGraph database using the IP of the <cluster-name>-graphd-svc
Service above:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_ip> -port <service_port> -u <username> -p <password>\n
For example:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 10.98.213.34 -port 9669 -u root -p vesoft\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
: The custom Pod name.-addr
: The IP of the ClusterIP
Service, used to connect to Graphd services.-port
: The port to connect to Graphd services, the default port of which is 9669
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.A successful connection to the database is indicated if the following is returned:
If you don't see a command prompt, try pressing enter.\n\n(root@nebula) [(none)]>\n
You can also connect to NebulaGraph databases with Fully Qualified Domain Name (FQDN). The domain format is <cluster-name>-graphd.<cluster-namespace>.svc.<CLUSTER_DOMAIN>
. The default value of CLUSTER_DOMAIN
is cluster.local
.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <cluster_name>-graphd-svc.default.svc.cluster.local -port <service_port> -u <username> -p <password>\n
service_port
is the port to connect to Graphd services, the default port of which is 9669
.
Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr nebula-graphd-svc.default.svc.cluster.local -port 9669 -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
NodePort
","text":"You can create a NodePort
type Service to access internal cluster services from outside the cluster using any node IP and the exposed node port. You can also utilize load balancing services provided by cloud vendors (such as Azure, AWS, etc.) by setting the Service type to LoadBalancer
. This allows external access to internal cluster services through the public IP and port of the load balancer provided by the cloud vendor.
The Service of type NodePort
forwards the front-end requests via the label selector spec.selector
to Graphd pods with labels app.kubernetes.io/cluster: <cluster-name>
and app.kubernetes.io/component: graphd
.
After creating a NebulaGraph cluster based on the example template, where spec.graphd.service.type=NodePort
, the NebulaGraph Operator will automatically create a NodePort type Service named <cluster-name>-graphd-svc
in the same namespace. You can directly connect to the NebulaGraph database through any node IP and the exposed node port (see step 4 below). You can also create a custom Service according to your needs.
Steps:
Create a YAML file named graphd-nodeport-service.yaml
. The file contents are as follows:
apiVersion: v1\nkind: Service\nmetadata:\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: nebula-graphd-svc-nodeport\n namespace: default\nspec:\n externalTrafficPolicy: Local\n ports:\n - name: thrift\n port: 9669\n protocol: TCP\n targetPort: 9669\n - name: http\n port: 19669\n protocol: TCP\n targetPort: 19669\n selector:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: graphd\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n type: NodePort # Set the type to NodePort.\n
9669
by default. 19669
is the HTTP port of the Graph service in a NebulaGraph cluster.targetPort
is the port mapped to the database Pods, which can be customized.Run the following command to create a NodePort Service.
kubectl create -f graphd-nodeport-service.yaml\n
Check the port mapped on all of your cluster nodes.
kubectl get services -l app.kubernetes.io/cluster=<nebula> # <nebula> is the name of your NebulaGraph cluster.\n
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-svc-nodeport NodePort 10.107.153.129 <none> 9669:32236/TCP,19669:31674/TCP,19670:31057/TCP 24h\n...\n
As you see, the mapped port of NebulaGraph databases on all cluster nodes is 32236
.
Connect to NebulaGraph databases with your node IP and the node port above.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <node_ip> -port <node_port> -u <username> -p <password>\n
For example:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 192.168.8.24 -port 32236 -u root -p vesoft\nIf you don't see a command prompt, try pressing enter.\n\n(root@nebula) [(none)]>\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
: The custom Pod name. The above example uses nebula-console
.-addr
: The IP of any node in a NebulaGraph cluster. The above example uses 192.168.8.24
.-port
: The mapped port of NebulaGraph databases on all cluster nodes. The above example uses 32236
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr <node_ip> -port <node_port> -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
When dealing with multiple pods in a cluster, managing services for each pod separately is not a good practice. Ingress is a Kubernetes resource that provides a unified entry point for accessing multiple services. Ingress can be used to expose multiple services under a single IP address.
Nginx Ingress is an implementation of Kubernetes Ingress. Nginx Ingress watches the Ingress resource of a Kubernetes cluster and generates the Ingress rules into Nginx configurations that enable Nginx to forward 7 layers of traffic.
You can use Nginx Ingress to connect to a NebulaGraph cluster from outside the cluster using a combination of the host network and DaemonSet pattern.
Due to the use of HostNetwork
, Nginx Ingress pods may be scheduled on the same node (port conflicts will occur when multiple pods try to listen on the same port on the same node). To avoid this situation, Nginx Ingress is deployed on these nodes in DaemonSet mode (ensuring that a pod replica runs on each node in the cluster). You first need to select some nodes and label them for the specific deployment of Nginx Ingress.
Ingress does not support TCP or UDP services. For this reason, the nginx-ingress-controller pod uses the flags --tcp-services-configmap
and --udp-services-configmap
to point to an existing ConfigMap where the key refers to the external port to be used and the value refers to the format of the service to be exposed. The format of the value is <namespace/service_name>:<service_port>
.
For example, the configurations of the ConfigMap named as tcp-services
is as follows:
apiVersion: v1\nkind: ConfigMap\nmetadata:\n name: tcp-services\n namespace: nginx-ingress\ndata:\n # update \n 9769: \"default/nebula-graphd-svc:9669\"\n
Steps are as follows.
Create a file named nginx-ingress-daemonset-hostnetwork.yaml
.
Click on nginx-ingress-daemonset-hostnetwork.yaml to view the complete content of the example YAML file.
Note
The resource objects in the YAML file above use the namespace nginx-ingress
. You can run kubectl create namespace nginx-ingress
to create this namespace, or you can customize the namespace.
Label a node where the DaemonSet named nginx-ingress-controller
in the above YAML file (The node used in this example is named worker2
with an IP of 192.168.8.160
) runs.
kubectl label node worker2 nginx-ingress=true\n
Run the following command to enable Nginx Ingress in the cluster you created.
kubectl create -f nginx-ingress-daemonset-hostnetwork.yaml\n
Output:
configmap/nginx-ingress-controller created\nconfigmap/tcp-services created\nserviceaccount/nginx-ingress created\nserviceaccount/nginx-ingress-backend created\nclusterrole.rbac.authorization.k8s.io/nginx-ingress created\nclusterrolebinding.rbac.authorization.k8s.io/nginx-ingress created\nrole.rbac.authorization.k8s.io/nginx-ingress created\nrolebinding.rbac.authorization.k8s.io/nginx-ingress created\nservice/nginx-ingress-controller-metrics created\nservice/nginx-ingress-default-backend created\nservice/nginx-ingress-proxy-tcp created\ndaemonset.apps/nginx-ingress-controller created\n
Since the network type that is configured in Nginx Ingress is hostNetwork
, after successfully deploying Nginx Ingress, with the IP (192.168.8.160
) of the node where Nginx Ingress is deployed and with the external port (9769
) you define, you can access NebulaGraph.
Use the IP address and the port configured in the preceding steps. You can connect to NebulaGraph with NebulaGraph Console.
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- <nebula_console_name> -addr <host_ip> -port <external_port> -u <username> -p <password>\n
Output:
kubectl run -ti --image vesoft/nebula-console:v3.6.0 --restart=Never -- nebula-console -addr 192.168.8.160 -port 9769 -u root -p vesoft\n
--image
: The image for the tool NebulaGraph Console used to connect to NebulaGraph databases.<nebula-console>
The custom Pod name. The above example uses nebula-console
.-addr
: The IP of the node where Nginx Ingress is deployed. The above example uses 192.168.8.160
.-port
: The port used for external network access. The above example uses 9769
.-u
: The username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root.-p
: The password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password.A successful connection to the database is indicated if the following is returned:
If you don't see a command prompt, try pressing enter.\n(root@nebula) [(none)]>\n
Note
If the spec.console
field is set in the cluster configuration file, you can also connect to NebulaGraph databases with the following command:
# Enter the nebula-console Pod.\nkubectl exec -it nebula-console -- /bin/sh\n\n# Connect to NebulaGraph databases.\nnebula-console -addr <ingress_host_ip> -port <external_port> -u <username> -p <password>\n
For information about the nebula-console container, see nebula-console.
This topic introduces how to customize the default configurations when installing NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.1.customize-installation/#customizable_parameters","title":"Customizable parameters","text":"When executing the helm install [NAME] [CHART] [flags]
command to install a chart, you can specify the chart configuration. For more information, see Customizing the Chart Before Installing.
You can view the configurable options in the nebula-operator chart configuration file. Alternatively, you can view the configurable options through the command helm show values nebula-operator/nebula-operator
, as shown below.
[root@master ~]$ helm show values nebula-operator/nebula-operator \nimage:\n nebulaOperator:\n image: vesoft/nebula-operator:v1.8.0\n imagePullPolicy: Always\n\nimagePullSecrets: [ ]\nkubernetesClusterDomain: \"\"\n\ncontrollerManager:\n create: true\n replicas: 2\n env: [ ]\n resources:\n limits:\n cpu: 200m\n memory: 200Mi\n requests:\n cpu: 100m\n memory: 100Mi\n verbosity: 0\n ## Additional InitContainers to initialize the pod\n # Example:\n # extraInitContainers:\n # - name: init-auth-sidecar\n # command:\n # - /bin/sh\n # - -c\n # args:\n # - cp -R /certs/* /credentials/\n # imagePullPolicy: Always\n # image: reg.vesoft-inc.com/nebula-certs:latest\n # volumeMounts:\n # - name: credentials\n # mountPath: /credentials\n extraInitContainers: []\n\n # sidecarContainers - add more containers to controller-manager\n # Key/Value where Key is the sidecar `- name: <Key>`\n # Example:\n # sidecarContainers:\n # webserver:\n # image: nginx\n # OR for adding netshoot to controller manager\n # sidecarContainers:\n # netshoot:\n # args:\n # - -c\n # - while true; do ping localhost; sleep 60;done\n # command:\n # - /bin/bash\n # image: nicolaka/netshoot\n # imagePullPolicy: Always\n # name: netshoot\n # resources: {}\n sidecarContainers: {}\n\n ## Additional controller-manager Volumes\n extraVolumes: []\n\n ## Additional controller-manager Volume mounts\n extraVolumeMounts: []\n\n securityContext: {}\n # runAsNonRoot: true\n\nadmissionWebhook:\n create: false\n # The TCP port the Webhook server binds to. (default 9443)\n webhookBindPort: 9443\n\nscheduler:\n create: true\n schedulerName: nebula-scheduler\n replicas: 2\n env: [ ]\n resources:\n limits:\n cpu: 200m\n memory: 200Mi\n requests:\n cpu: 100m\n memory: 100Mi\n verbosity: 0\n plugins:\n enabled: [\"NodeZone\"]\n disabled: [] # Only in-tree plugins need to be defined here\n...\n
Part of the above parameters are described as follows:
Parameter Default value Descriptionimage.nebulaOperator.image
vesoft/nebula-operator:v1.8.0
The image of NebulaGraph Operator, version of which is 1.8.0. image.nebulaOperator.imagePullPolicy
IfNotPresent
The image pull policy in Kubernetes. imagePullSecrets
- The image pull secret in Kubernetes. For example imagePullSecrets[0].name=\"vesoft\"
. kubernetesClusterDomain
cluster.local
The cluster domain. controllerManager.create
true
Whether to enable the controller-manager component. controllerManager.replicas
2
The number of controller-manager replicas. controllerManager.env
[]
The environment variables for the controller-manager component. controllerManager.extraInitContainers
[]
Runs an init container. controllerManager.sidecarContainers
{}
Runs a sidecar container. controllerManager.extraVolumes
[]
Sets a storage volume. controllerManager.extraVolumeMounts
[]
Sets the storage volume mount path. controllerManager.securityContext
{}
Configures the access and control settings for NebulaGraph Operator. admissionWebhook.create
false
Whether to enable Admission Webhook. This option is disabled. To enable it, set the value to true
and you will need to install cert-manager. For details, see Enable admission control. admissionWebhook.webhookBindPort
9443
The TCP port the Webhook server binds to. It is 9443 by default. shceduler.create
true
Whether to enable Scheduler. shceduler.schedulerName
nebula-scheduler
The name of the scheduler customized by NebulaGraph Operator. shceduler.replicas
2
The number of nebula-scheduler replicas."},{"location":"k8s-operator/3.operator-management/3.1.customize-installation/#example","title":"Example","text":"The following example shows how to enable AdmissionWebhook when you install NebulaGraph Operator (AdmissionWebhook is disabled by default):
helm install nebula-operator nebula-operator/nebula-operator --namespace=<nebula-operator-system> --set admissionWebhook.create=true\n
Check whether the specified configuration of NebulaGraph Operator is installed successfully:
helm get values nebula-operator -n <nebula-operator-system>\n
Example output:
USER-SUPPLIED VALUES:\nadmissionWebhook:\n create: true\n
For more information about helm install
, see Helm Install.
This topic introduces how to update the configuration of NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.2.update-operator/#steps","title":"Steps","text":"Update the information of available charts locally from chart repositories.
helm repo update\n
View the default values of NebulaGraph Operator.
helm show values nebula-operator/nebula-operator\n
Update NebulaGraph Operator by passing configuration parameters via --set
.
--set
: Overrides values using the command line. For more configurable items, see Customize installation defaults.
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0 --set admissionWebhook.create=true\n
For more information, see Helm upgrade.
Check whether the configuration of NebulaGraph Operator is updated successfully.
helm get values nebula-operator -n nebula-operator-system\n
Example output:
USER-SUPPLIED VALUES:\nadmissionWebhook:\n create: true\n
Legacy version compatibility
View the current version of NebulaGraph Operator.
helm list --all-namespaces\n
Example output:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION\nnebula-operator nebula-operator-system 3 2023-11-06 12:06:24.742397418 +0800 CST deployed nebula-operator-1.7.0 1.7.0\n
Update the information of available charts locally from chart repositories.
helm repo update\n
View the latest version of NebulaGraph Operator.
helm search repo nebula-operator/nebula-operator\n
Example output:
NAME CHART VERSION APP VERSION DESCRIPTION\nnebula-operator/nebula-operator 1.8.0 1.8.0 Nebula Operator Helm chart for Kubernetes\n
Upgrade NebulaGraph Operator to version 1.8.0.
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=<namespace_name> --version=1.8.0\n
For example:
helm upgrade nebula-operator nebula-operator/nebula-operator --namespace=nebula-operator-system --version=1.8.0\n
Output:
Release \"nebula-operator\" has been upgraded. Happy Helming!\nNAME: nebula-operator\nLAST DEPLOYED: Tue Apr 16 02:21:08 2022\nNAMESPACE: nebula-operator-system\nSTATUS: deployed\nREVISION: 3\nTEST SUITE: None\nNOTES:\nNebulaGraph Operator installed!\n
Pull the latest CRD configuration file.
Note
You need to upgrade the corresponding CRD configurations after NebulaGraph Operator is upgraded. Otherwise, the creation of NebulaGraph clusters will fail. For information about the CRD configurations, see apps.nebula-graph.io_nebulaclusters.yaml.
Pull the NebulaGraph Operator chart package.
helm pull nebula-operator/nebula-operator --version=1.8.0\n
--version
: The NebulaGraph Operator version you want to upgrade to. If not specified, the latest version will be pulled.Run tar -zxvf
to unpack the charts.
For example: To unpack v1.8.0 chart to the /tmp
path, run the following command:
tar -zxvf nebula-operator-1.8.0.tgz -C /tmp\n
-C /tmp
: If not specified, the chart files will be unpacked to the current directory.Apply the latest CRD configuration file in the nebula-operator
directory.
kubectl apply -f crds/nebulaclusters.yaml\n
Output:
customresourcedefinition.apiextensions.k8s.io/nebulaclusters.apps.nebula-graph.io configured\n
This topic introduces how to uninstall NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.4.unistall-operator/#steps","title":"Steps","text":"Uninstall the NebulaGraph Operator chart.
helm uninstall nebula-operator --namespace=<nebula-operator-system>\n
View the information about the default-created CRD.
kubectl get crd\n
Output:
NAME CREATED AT\nnebulaautoscalers.autoscaling.nebula-graph.io 2023-11-01T04:16:51Z\nnebulaclusters.apps.nebula-graph.io 2023-10-12T07:55:32Z\nnebularestores.apps.nebula-graph.io 2023-02-04T23:01:00Z\n
Delete CRD.
kubectl delete crd nebulaclusters.apps.nebula-graph.io nebularestores.apps.nebula-graph.io nebulaautoscalers.autoscaling.nebula-graph.io\n
NebulaGraph Operator supports the management of multiple NebulaGraph clusters. By default, NebulaGraph Operator manages all NebulaGraph clusters. However, you can specify the clusters managed by NebulaGraph Operator. This topic describes how to specify the clusters managed by NebulaGraph Operator.
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#application_scenarios","title":"Application scenarios","text":"NebulaGraph Operator supports specifying the clusters managed by controller-manager through startup parameters. The supported parameters are as follows:
watchNamespaces
: Specifies the namespace where the NebulaGraph cluster is located. To specify multiple namespaces, separate them with commas (,
). For example, watchNamespaces=default,nebula
. If this parameter is not specified, NebulaGraph Operator manages all NebulaGraph clusters in all namespaces.nebulaObjectSelector
: Allows you to set specific labels and values to select the NebulaGraph clusters to be managed. It supports three label operation symbols: =
, ==
, and !=
. Both =
and ==
mean that the label's value is equal to the specified value, while !=
means the label's value is not equal to the specified value. Multiple labels are separated by commas (,
), and the comma needs to be escaped with \\\\
. For example, nebulaObjectSelector=key1=value1\\\\,key2=value2
, which selects only the NebulaGraph clusters with labels key1=value1
and key2=value2
. If this parameter is not specified, NebulaGraph Operator manages all NebulaGraph clusters.Run the following command to make NebulaGraph Operator manage only the NebulaGraph clusters in the default
and nebula
namespaces. Ensure that the current Helm Chart version supports this parameter. For more information, see Update the configuration.
helm upgrade nebula-operator nebula-operator/nebula-operator --set watchNamespaces=default,nebula\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#specify_the_managed_clusters_by_label","title":"Specify the managed clusters by label","text":"Run the following command to make NebulaGraph Operator manage only the NebulaGraph clusters with the labels key1=value1
and key2=value2
. Ensure that the current Helm Chart version supports this parameter. For more information, see Update the configuration.
helm upgrade nebula-operator nebula-operator/nebula-operator --set nebulaObjectSelector=key1=value1\\\\,key2=value2\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#faq","title":"FAQ","text":""},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_set_labels_for_nebulagraph_clusters","title":"How to set labels for NebulaGraph clusters?","text":"Run the following command to set a label for the NebulaGraph cluster:
kubectl label nc <cluster_name> -n <namespace> <key>=<value>\n
For example, set the label env=test
for the NebulaGraph cluster named nebula
in the nebulaspace
namespace:
kubectl label nc nebula -n nebulaspace env=test\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_view_the_labels_of_nebulagraph_clusters","title":"How to view the labels of NebulaGraph clusters?","text":"Run the following command to view the labels of NebulaGraph clusters:
kubectl get nc <cluster_name> -n <namespace> --show-labels\n
For example, view the labels of the NebulaGraph cluster named nebula
in the nebulaspace
namespace:
kubectl get nc nebula -n nebulaspace --show-labels\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_delete_the_labels_of_nebulagraph_clusters","title":"How to delete the labels of NebulaGraph clusters?","text":"Run the following command to delete the label of NebulaGraph clusters:
kubectl label nc <cluster_name> -n <namespace> <key>-\n
For example, delete the label env=test
of the NebulaGraph cluster named nebula
in the nebulaspace
namespace:
kubectl label nc nebula -n nebulaspace env-\n
"},{"location":"k8s-operator/3.operator-management/3.5.cluster-scope-config/#how_to_view_the_namespace_where_the_nebulagraph_cluster_is_located","title":"How to view the namespace where the NebulaGraph cluster is located?","text":"Run the following command to list all namespaces where the NebulaGraph clusters are located:
kubectl get nc --all-namespaces\n
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/","title":"Customize the configuration of the NebulaGraph cluster","text":"The Meta, Storage, and Graph services each have their default configurations within the NebulaGraph cluster. NebulaGraph Operator allows for the customization of these cluster service configurations. This topic describes how to update the settings of the NebulaGraph cluster.
Note
Configuring the parameters of the NebulaGraph cluster via Helm isn't currently supported.
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/#prerequisites","title":"Prerequisites","text":"A cluster is created using NebulaGraph Operator. For details, see Create a NebulaGraph Cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/#configuration_method","title":"Configuration method","text":"You can update the configurations of cluster services by customizing parameters through spec.<metad|graphd|storaged>.config
. NebulaGraph Operator loads the configurations from config
into the corresponding service's ConfigMap, which is then mounted into the service's configuration file directory (/usr/local/nebula/etc/
) at the time of the service launch.
The structure of config
is as follows:
Config map[string]string `json:\"config,omitempty\"`\n
For instance, when updating the Graph service's enable_authorize
parameter settings, the spec.graphd.config
parameter can be specified at the time of cluster creation, or during cluster runtime.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n graphd:\n ...\n config: // Custom-defined parameters for the Graph service.\n \"enable_authorize\": \"true\" // Enable authorization. Default value is false.\n...\n
If you need to configure config
for the Meta and Storage services, add corresponding configuration items to spec.metad.config
and spec.storaged.config
.
For more detailed information on the parameters that can be set under the config
field, see the following:
Configuration parameters for cluster services fall into two categories: those which require a service restart for any updates; and those which can be dynamically updated during service runtime. For the latter type, the updates will not be saved; subsequent to a service restart, configurations will revert to the state as shown in the configuration file.
Regarding if the configuration parameters support dynamic updates during service runtime, please verify the information within the Whether supports runtime dynamic modifications column on each of the service configuration parameter detail pages linked above or see Dynamic runtime flags.
During the update of cluster service configurations, keep the following points in mind:
config
all allow for dynamic runtime updates, a service Pod restart will not be triggered and the configuration parameter updates will not be saved.config
include one or more that don\u2019t allow for dynamic runtime updates, a service Pod restart will be triggered, but only updates to those parameters that don\u2019t allow for dynamic updates will be saved.Note
If you wish to modify the parameter settings during cluster runtime without triggering a Pod restart, make sure that all the parameters support dynamic updates during runtime.
"},{"location":"k8s-operator/4.cluster-administration/4.2.configuration/#customize_port_configuration","title":"Customize port configuration","text":"The following example demonstrates how to customize the port configurations for the Meta, Storage, and Graph services.
You can add port
and ws_http_port
parameters to the config
field in order to set custom ports. For detailed information regarding these two parameters, see the networking configuration sections at Meta Service Configuration Parameters, Storage Service Configuration Parameters, Graph Service Configuration Parameters.
Note
port
and ws_http_port
parameter settings, a Pod restart is triggered and then the updated settings take effect after the restart.port
parameter.Modify the cluster configuration file.
Open the cluster configuration file.
kubectl edit nc nebula\n
Modify the configuration file as follows.
Add the config
field to the graphd
, metad
, and storaged
sections to customize the port configurations for the Graph, Meta, and Storage services, respectively.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n graphd:\n config: // Custom port configuration for the Graph service.\n port: \"3669\"\n ws_http_port: \"8080\"\n resources:\n requests:\n cpu: \"200m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n replicas: 1\n image: vesoft/nebula-graphd\n version: master\n metad: \n config: // Custom port configuration for the Meta service.\n ws_http_port: 8081\n resources:\n requests:\n cpu: \"300m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n replicas: 1\n image: vesoft/nebula-metad\n version: master\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-path\n storaged: \n config: // Custom port configuration for the Storage service.\n ws_http_port: 8082\n resources:\n requests:\n cpu: \"300m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n replicas: 1\n image: vesoft/nebula-storaged\n version: master\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-path\n enableAutoBalance: true\n reference:\n name: statefulsets.apps\n version: v1\n schedulerName: default-scheduler\n imagePullPolicy: IfNotPresent\n imagePullSecrets:\n - name: nebula-image\n enablePVReclaim: true\n topologySpreadConstraints:\n - topologyKey: kubernetes.io/hostname\n whenUnsatisfiable: \"ScheduleAnyway\"\n
Save the changes.
Changes will be saved automatically after saving the file.
Esc
to enter command mode.:wq
to save and exit.Validate that the configurations have taken effect.
kubectl get svc\n
Example output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE\nnebula-graphd-headless ClusterIP None <none> 3669/TCP,8080/TCP 10m\nnebula-graphd-svc ClusterIP 10.102.13.115 <none> 3669/TCP,8080/TCP 10m\nnebula-metad-headless ClusterIP None <none> 9559/TCP,8081/TCP 11m\nnebula-storaged-headless ClusterIP None <none> 9779/TCP,8082/TCP,9778/TCP 11m\n
As can be noticed, the Graph service's RPC daemon port is changed to 3669
(default 9669
), the HTTP port to 8080
(default 19669
); the Meta service's HTTP port is changed to 8081
(default 19559
); the Storage service's HTTP port is changed to 8082
(default 19779
).
Running logs of NebulaGraph cluster services (graphd, metad, storaged) are generated and stored in the /usr/local/nebula/logs
directory of each service container by default.
To view the running logs of a NebulaGraph cluster, you can use the kubectl logs
command.
For example, to view the running logs of the Storage service:
// View the name of the Storage service Pod, nebula-storaged-0.\n$ kubectl get pods -l app.kubernetes.io/component=storaged\nNAME READY STATUS RESTARTS AGE\nnebula-storaged-0 1/1 Running 0 45h\n...\n\n// Enter the container storaged of the Storage service.\n$ kubectl exec -it nebula-storaged-0 -c storaged -- /bin/bash\n\n// View the running logs of the Storage service.\n$ cd /usr/local/nebula/logs\n
"},{"location":"k8s-operator/4.cluster-administration/4.5.logging/#clean_logs","title":"Clean logs","text":"Running logs generated by cluster services during runtime will occupy disk space. To avoid occupying too much disk space, the NebulaGraph Operator uses a sidecar container to periodically clean and archive logs.
To facilitate log collection and management, each NebulaGraph service deploys a sidecar container responsible for collecting logs generated by the service container and sending them to the specified log disk. The sidecar container automatically cleans and archives logs using the logrotate tool.
In the YAML configuration file of the cluster instance, set spec.logRotate to configure log rotation, and set timestamp_in_logfile_name to false to remove the timestamp from log file names; both settings are required to enable log rotation for a service. The timestamp_in_logfile_name parameter is configured under the spec.<graphd|metad|storaged>.config field. Log rotation is disabled by default. Here is an example of enabling log rotation for all services:
...\nspec:\n graphd:\n config:\n # Whether to include a timestamp in the log file name. \n # You must set this parameter to false to enable log rotation. \n # It is set to true by default.\n \"timestamp_in_logfile_name\": \"false\"\n metad:\n config:\n \"timestamp_in_logfile_name\": \"false\"\n storaged:\n config:\n \"timestamp_in_logfile_name\": \"false\"\n logRotate: # Log rotation configuration\n # The number of times a log file is rotated before being deleted.\n # The default value is 5, and 0 means the log file will not be rotated before being deleted.\n rotate: 5\n # The log file is rotated only if it grows larger than the specified size. The default value is 200M.\n size: \"200M\"\n
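If you keep the cluster definition in a local file instead of editing the resource in place, the same change can be applied declaratively. A sketch, assuming the manifest is saved as nebulacluster.yaml:
# Apply the updated manifest; the Operator rolls out the change and the sidecar containers start rotating logs.\nkubectl apply -f nebulacluster.yaml\n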
"},{"location":"k8s-operator/4.cluster-administration/4.5.logging/#collect_logs","title":"Collect logs","text":"If you don't want to mount additional log disks to back up log files, or if you want to collect logs and send them to a log center using services like fluent-bit, you can configure logs to be output to standard error. The Operator uses the glog tool to log to standard error output.
Note
Currently, NebulaGraph Operator only collects standard error logs.
In the YAML configuration file of the cluster instance, you can configure logging to standard error output in the config
and env
fields of each service.
...\nspec:\n graphd:\n config:\n # Whether to redirect standard error to a separate output file. The default value is false, which means it is not redirected.\n redirect_stdout: \"false\"\n # The severity level of log content: INFO, WARNING, ERROR, and FATAL. The corresponding values are 0, 1, 2, and 3.\n stderrthreshold: \"0\"\n env: \n - name: GLOG_logtostderr # Write log to standard error output instead of a separate file.\n value: \"1\" # 1 represents writing to standard error output, and 0 represents writing to a file.\n image: vesoft/nebula-graphd\n replicas: 1\n resources:\n requests:\n cpu: 500m\n memory: 500Mi\n service:\n externalTrafficPolicy: Local\n type: NodePort\n version: vmaster\n metad:\n config:\n redirect_stdout: \"false\"\n stderrthreshold: \"0\"\n dataVolumeClaim:\n resources:\n requests:\n storage: 1Gi\n storageClassName: ebs-sc\n env:\n - name: GLOG_logtostderr\n value: \"1\"\n image: vesoft/nebula-metad\n ...\n
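Once GLOG_logtostderr is set to 1, the service logs appear on the container's standard error, so they can be read with kubectl and scraped by collectors such as fluent-bit. A quick check, assuming a Graph service Pod named nebula-graphd-0:
kubectl logs nebula-graphd-0 -c graphd --tail=50\n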
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.1.cluster-install/","title":"Install a NebulaGraph cluster using NebulaGraph Operator","text":"Using NebulaGraph Operator to install NebulaGraph clusters enables automated cluster management with automatic error recovery. This topic covers two methods, kubectl apply
and helm
, for installing clusters using NebulaGraph Operator.
Historical version compatibility
NebulaGraph Operator versions 1.x are not compatible with NebulaGraph versions below 3.x.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.1.cluster-install/#prerequisites","title":"Prerequisites","text":"kubectl apply
","text":"Create a namespace for storing NebulaGraph cluster-related resources. For example, create the nebula
namespace.
kubectl create namespace nebula\n
Create a YAML configuration file nebulacluster.yaml
for the cluster. For example, create a cluster named nebula
.
nebula
cluster apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\n namespace: default\nspec:\n # Control the Pod scheduling strategy.\n topologySpreadConstraints:\n - topologyKey: \"kubernetes.io/hostname\"\n whenUnsatisfiable: \"ScheduleAnyway\"\n # Enable PV recycling.\n enablePVReclaim: false\n # Enable monitoring.\n exporter:\n image: vesoft/nebula-stats-exporter\n version: v3.3.0\n replicas: 1\n maxRequests: 20\n # Custom Agent image for cluster backup and restore, and log cleanup.\n agent:\n image: vesoft/nebula-agent\n version: latest\n resources:\n requests:\n cpu: \"100m\"\n memory: \"128Mi\"\n limits:\n cpu: \"200m\"\n memory: \"256Mi\" \n # Configure the image pull policy.\n imagePullPolicy: Always\n # Select the nodes for Pod scheduling.\n nodeSelector:\n nebula: cloud\n # Dependent controller name.\n reference:\n name: statefulsets.apps\n version: v1\n # Scheduler name.\n schedulerName: default-scheduler \n # Start NebulaGraph Console service for connecting to the Graph service.\n console:\n image: vesoft/nebula-console\n version: nightly\n username: \"demo\"\n password: \"test\" \n # Graph service configuration. \n graphd:\n # Used to check if the Graph service is running normally.\n # readinessProbe:\n # failureThreshold: 3\n # httpGet:\n # path: /status\n # port: 19669\n # scheme: HTTP\n # initialDelaySeconds: 40\n # periodSeconds: 10\n # successThreshold: 1\n # timeoutSeconds: 10\n # Container image for the Graph service.\n image: vesoft/nebula-graphd\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n # Storage class name for storing Graph service logs.\n storageClassName: local-sc\n # Number of replicas for the Graph service Pod.\n replicas: 1\n # Resource configuration for the Graph service.\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n # Version of the Graph service.\n version: vmaster\n # Custom flags configuration for the Graph service.\n config: {}\n # Meta service configuration.\n metad:\n # readinessProbe:\n # failureThreshold: 3\n # httpGet:\n # path: /status\n # port: 19559\n # scheme: HTTP\n # initialDelaySeconds: 5\n # periodSeconds: 5\n # successThreshold: 1\n # timeoutSeconds: 5\n # Container image for the Meta service.\n image: vesoft/nebula-metad\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n # Custom flags configuration for the Meta service.\n config: {} \n # Storage service configuration.\n storaged:\n # readinessProbe:\n # failureThreshold: 3\n # httpGet:\n # path: /status\n # port: 19779\n # scheme: HTTP\n # initialDelaySeconds: 40\n # periodSeconds: 10\n # successThreshold: 1\n # timeoutSeconds: 5\n # Container image for the Storage service.\n image: vesoft/nebula-storaged\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-sc\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: vmaster\n # Custom flags configuration for the Storage service.\n config: {} \n
Expand to view all configurable parameters and descriptions Parameter Default Value Description metadata.name
- The name of the created NebulaGraph cluster. spec.console
- Launches a Console container for connecting to the Graph service. For configuration details, see nebula-console. spec.topologySpreadConstraints
- Controls the scheduling strategy for Pods. For more details, see Topology Spread Constraints. When the value of topologyKey
is kubernetes.io/zone
, the value of whenUnsatisfiable
must be set to DoNotSchedule
, and the value of spec.schedulerName
should be nebula-scheduler
. spec.graphd.replicas
1
The number of replicas for the Graphd service. spec.graphd.image
vesoft/nebula-graphd
The container image for the Graphd service. spec.graphd.version
master
The version of the Graphd service. spec.graphd.service
Configuration for accessing the Graphd service via a Service. spec.graphd.logVolumeClaim.storageClassName
- The storage class name for the log volume claim of the Graphd service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.metad.replicas
1
The number of replicas for the Metad service. spec.metad.image
vesoft/nebula-metad
The container image for the Metad service. spec.metad.version
master
The version of the Metad service. spec.metad.dataVolumeClaim.storageClassName
- Storage configuration for the data disk of the Metad service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.metad.logVolumeClaim.storageClassName
- Storage configuration for the log disk of the Metad service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.storaged.replicas
3
The number of replicas for the Storaged service. spec.storaged.image
vesoft/nebula-storaged
The container image for the Storaged service. spec.storaged.version
master
The version of the Storaged service. spec.storaged.dataVolumeClaims.resources.requests.storage
- The storage size for the data disk of the Storaged service. You can specify multiple data disks. When specifying multiple data disks, the paths are like /usr/local/nebula/data1
, /usr/local/nebula/data2
, and so on. spec.storaged.dataVolumeClaims.storageClassName
- Storage configuration for the data disks of the Storaged service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.storaged.logVolumeClaim.storageClassName
- Storage configuration for the log disk of the Storaged service. When using sample configuration, replace it with the name of the pre-created storage class. See Storage Classes for creating a storage class. spec.<metad|storaged|graphd>.securityContext
{}
Defines the permission and access control for the cluster containers to control access and execution of container operations. For details, see SecurityContext. spec.agent
{}
Configuration for the Agent service used for backup and recovery, and log cleaning functions. If you don't customize this configuration, the default configuration is used. spec.reference.name
{}
The name of the controller it depends on. spec.schedulerName
default-scheduler
The name of the scheduler. spec.imagePullPolicy
Always
The image pull policy for NebulaGraph images. For more details on pull policies, please see Image pull policy. spec.logRotate
{}
Log rotation configuration. For details, see Managing Cluster Logs. spec.enablePVReclaim
false
Defines whether to automatically delete PVCs after deleting the cluster to release data. For details, see Reclaim PV. spec.metad.licenseManagerURL
- Configures the URL pointing to the License Manager (LM), consisting of the access address and port (default port 9119
). For example, 192.168.8.xxx:9119
. For creating the NebulaGraph Enterprise Edition only. spec.storaged.enableAutoBalance
false
Whether to enable automatic balancing. For details, see Balancing Storage Data After Scaling Out. spec.enableBR
false
Defines whether to enable the BR tool. For details, see Backup and Restore. spec.imagePullSecrets
[]
Defines the Secret required to pull images from a private repository. Create the NebulaGraph cluster.
kubectl create -f nebulacluster.yaml -n nebula\n
Output:
nebulacluster.apps.nebula-graph.io/nebula created\n
If you don't specify the namespace using -n
, it will default to the default
namespace.
Check the status of the NebulaGraph cluster.
kubectl get nebulaclusters nebula -n nebula\n
Output:
NAME READY GRAPHD-DESIRED GRAPHD-READY METAD-DESIRED METAD-READY STORAGED-DESIRED STORAGED-READY AGE\nnebula True 1 1 1 1 1 1 86s\n
helm
","text":"Add the NebulaGraph Operator Helm repository (if it's already added, run the next step directly).
helm repo add nebula-operator https://vesoft-inc.github.io/nebula-operator/charts\n
Update the Helm repository to fetch the latest resources.
helm repo update nebula-operator\n
Set environment variables for the configuration parameters required for installing the cluster.
export NEBULA_CLUSTER_NAME=nebula # Name of the NebulaGraph cluster.\nexport NEBULA_CLUSTER_NAMESPACE=nebula # Namespace for the NebulaGraph cluster.\nexport STORAGE_CLASS_NAME=local-sc # StorageClass for the NebulaGraph cluster.\n
Create a namespace for the NebulaGraph cluster if it is not created.
kubectl create namespace \"${NEBULA_CLUSTER_NAMESPACE}\"\n
Check the customizable configuration parameters for the nebula-cluster
Helm chart of the nebula-operator
when creating the cluster.
Run the following command to view all the configurable parameters.
helm show values nebula-operator/nebula-cluster\n
Example to view all configurable parameters nebula:\n version: master\n imagePullPolicy: Always\n storageClassName: \"\"\n enablePVReclaim: false\n enableBR: false\n enableForceUpdate: false\n schedulerName: default-scheduler \n topologySpreadConstraints:\n - topologyKey: \"kubernetes.io/hostname\"\n whenUnsatisfiable: \"ScheduleAnyway\"\n logRotate: {}\n reference:\n name: statefulsets.apps\n version: v1\n graphd:\n image: vesoft/nebula-graphd\n replicas: 2\n serviceType: NodePort\n env: []\n config: {}\n resources:\n requests:\n cpu: \"500m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"500Mi\"\n logVolume:\n enable: true\n storage: \"500Mi\"\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n\n metad:\n image: vesoft/nebula-metad\n replicas: 3\n env: []\n config: {}\n resources:\n requests:\n cpu: \"500m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n logVolume:\n enable: true\n storage: \"500Mi\"\n dataVolume:\n storage: \"2Gi\"\n licenseManagerURL: \"\"\n license: {}\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n\n storaged:\n image: vesoft/nebula-storaged\n replicas: 3\n env: []\n config: {}\n resources:\n requests:\n cpu: \"500m\"\n memory: \"500Mi\"\n limits:\n cpu: \"1\"\n memory: \"1Gi\"\n logVolume:\n enable: true\n storage: \"500Mi\"\n dataVolumes:\n - storage: \"10Gi\"\n enableAutoBalance: false\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n\n exporter:\n image: vesoft/nebula-stats-exporter\n version: v3.3.0\n replicas: 1\n env: []\n resources:\n requests:\n cpu: \"100m\"\n memory: \"128Mi\"\n limits:\n cpu: \"200m\"\n memory: \"256Mi\"\n podLabels: {}\n podAnnotations: {}\n securityContext: {}\n nodeSelector: {}\n tolerations: []\n affinity: {}\n readinessProbe: {}\n livenessProbe: {}\n initContainers: []\n sidecarContainers: []\n volumes: []\n volumeMounts: []\n maxRequests: 20\n\n agent:\n image: vesoft/nebula-agent\n version: latest\n resources:\n requests:\n cpu: \"100m\"\n memory: \"128Mi\"\n limits:\n cpu: \"200m\"\n memory: \"256Mi\"\n\n console:\n username: root\n password: nebula\n image: vesoft/nebula-console\n version: latest\n nodeSelector: {}\n\n alpineImage: \"\"\n\nimagePullSecrets: []\nnameOverride: \"\"\nfullnameOverride: \"\" \n
Expand to view parameter descriptions Parameter Default Value Description nebula.version
master Version of the cluster. nebula.imagePullPolicy
Always
Container image pull policy. Always
means always attempting to pull the latest image from the remote. nebula.storageClassName
\"\"
Name of the Kubernetes storage class for dynamic provisioning of persistent volumes. nebula.enablePVReclaim
false
Enable persistent volume reclaim. See Reclaim PV for details. nebula.enableBR
false
Enable the backup and restore feature. See Backup and Restore with NebulaGraph Operator for details. nebula.enableForceUpdate
false
Force update the Storage service without transferring the leader partition replicas. See Optimize leader transfer in rolling updates for details. nebula.schedulerName
default-scheduler
Name of the Kubernetes scheduler. Must be configured as nebula-scheduler
when using the Zone feature. nebula.topologySpreadConstraints
[]
Control the distribution of pods in the cluster. nebula.logRotate
{}
Log rotation configuration. See Manage cluster logs for details. nebula.reference
{\"name\": \"statefulsets.apps\", \"version\": \"v1\"}
The workload referenced for a NebulaGraph cluster. nebula.graphd.image
vesoft/nebula-graphd
Container image for the Graph service. nebula.graphd.replicas
2
Number of replicas for the Graph service. nebula.graphd.serviceType
NodePort
Service type for the Graph service, defining how the Graph service is accessed. See Connect to the Cluster for details. nebula.graphd.env
[]
Container environment variables for the Graph service. nebula.graphd.config
{}
Configuration for the Graph service. See Customize the configuration of the NebulaGraph cluster for details. nebula.graphd.resources
{\"resources\":{\"requests\":{\"cpu\":\"500m\",\"memory\":\"500Mi\"},\"limits\":{\"cpu\":\"1\",\"memory\":\"500Mi\"}}}
Resource limits and requests for the Graph service. nebula.graphd.logVolume
{\"logVolume\": {\"enable\": true,\"storage\": \"500Mi\"}}
Log storage configuration for the Graph service. When enable
is false
, log volume is not used. nebula.metad.image
vesoft/nebula-metad
Container image for the Meta service. nebula.metad.replicas
3
Number of replicas for the Meta service. nebula.metad.env
[]
Container environment variables for the Meta service. nebula.metad.config
{}
Configuration for the Meta service. See Customize the configuration of the NebulaGraph cluster for details. nebula.metad.resources
{\"resources\":{\"requests\":{\"cpu\":\"500m\",\"memory\":\"500Mi\"},\"limits\":{\"cpu\":\"1\",\"memory\":\"1Gi\"}}}
Resource limits and requests for the Meta service. nebula.metad.logVolume
{\"logVolume\": {\"enable\": true,\"storage\": \"500Mi\"}}
Log storage configuration for the Meta service. When enable
is false
, log volume is not used. nebula.metad.dataVolume
{\"dataVolume\": {\"storage\": \"2Gi\"}}
Data storage configuration for the Meta service. nebula.metad.licenseManagerURL
\"\"
URL for the license manager (LM) to obtain license information. For creating the NebulaGraph Enterprise Edition only. nebula.storaged.image
vesoft/nebula-storaged
Container image for the Storage service. nebula.storaged.replicas
3
Number of replicas for the Storage service. nebula.storaged.env
[]
Container environment variables for the Storage service. nebula.storaged.config
{}
Configuration for the Storage service. See Customize the configuration of the NebulaGraph cluster for details. nebula.storaged.resources
{\"resources\":{\"requests\":{\"cpu\":\"500m\",\"memory\":\"500Mi\"},\"limits\":{\"cpu\":\"1\",\"memory\":\"1Gi\"}}}
Resource limits and requests for the Storage service. nebula.storaged.logVolume
{\"logVolume\": {\"enable\": true,\"storage\": \"500Mi\"}}
Log storage configuration for the Storage service. When enable
is false
, log volume is not used. nebula.storaged.dataVolumes
{\"dataVolumes\": [{\"storage\": \"10Gi\"}]}
Data storage configuration for the Storage service. Supports specifying multiple data volumes. nebula.storaged.enableAutoBalance
false
Enable automatic balancing. See Balance storage data after scaling out for details. nebula.exporter.image
vesoft/nebula-stats-exporter
Container image for the Exporter service. nebula.exporter.version
v3.3.0
Version of the Exporter service. nebula.exporter.replicas
1
Number of replicas for the Exporter service. nebula.exporter.env
[]
Environment variables for the Exporter service. nebula.exporter.resources
{\"resources\":{\"requests\":{\"cpu\":\"100m\",\"memory\":\"128Mi\"},\"limits\":{\"cpu\":\"200m\",\"memory\":\"256Mi\"}}}
Resource limits and requests for the Exporter service. nebula.agent.image
vesoft/nebula-agent
Container image for the agent service. nebula.agent.version
latest
Version of the agent service. nebula.agent.resources
{\"resources\":{\"requests\":{\"cpu\":\"100m\",\"memory\":\"128Mi\"},\"limits\":{\"cpu\":\"200m\",\"memory\":\"256Mi\"}}}
Resource limits and requests for the agent service. nebula.console.username
root
Username for accessing the NebulaGraph Console client. See Connect to the cluster for details. nebula.console.password
nebula
Password for accessing the NebulaGraph Console client. nebula.console.image
vesoft/nebula-console
Container image for the NebulaGraph Console client. nebula.console.version
latest
Version of the NebulaGraph Console client. nebula.alpineImage
\"\"
Alpine Linux container image used to obtain zone information for nodes. imagePullSecrets
[]
Names of secrets to pull private images. nameOverride
\"\"
Cluster name. fullnameOverride
\"\"
Name of the released chart instance. nebula.<graphd|metad|storaged|exporter>.podLabels
{}
Additional labels to be added to the pod. nebula.<graphd|metad|storaged|exporter>.podAnnotations
{}
Additional annotations to be added to the pod. nebula.<graphd|metad|storaged|exporter>.securityContext
{}
Security context for setting pod-level security attributes, including user ID, group ID, Linux Capabilities, etc. nebula.<graphd|metad|storaged|exporter>.nodeSelector
{}
Label selectors for determining which nodes to run the pod on. nebula.<graphd|metad|storaged|exporter>.tolerations
[]
Tolerations allow a pod to be scheduled to nodes with specific taints. nebula.<graphd|metad|storaged|exporter>.affinity
{}
Affinity rules for the pod, including node affinity, pod affinity, and pod anti-affinity. nebula.<graphd|metad|storaged|exporter>.readinessProbe
{}
Probe to check if a container is ready to accept service requests. When the probe returns success, traffic can be routed to the container. nebula.<graphd|metad|storaged|exporter>.livenessProbe
{}
Probe to check if a container is still running. If the probe fails, Kubernetes will kill and restart the container. nebula.<graphd|metad|storaged|exporter>.initContainers
[]
Special containers that run before the main application container starts, typically used for setting up the environment or initializing data. nebula.<graphd|metad|storaged|exporter>.sidecarContainers
[]
Containers that run alongside the main application container, typically used for auxiliary tasks such as log processing, monitoring, etc. nebula.<graphd|metad|storaged|exporter>.volumes
[]
Storage volumes to be attached to the service pod. nebula.<graphd|metad|storaged|exporter>.volumeMounts
[]
Specifies where to mount the storage volume inside the container. Create the NebulaGraph cluster.
You can use the --set
flag to customize the default values of the NebulaGraph cluster configuration. For example, --set nebula.storaged.replicas=3
sets the number of replicas for the Storage service to 3.
helm install \"${NEBULA_CLUSTER_NAME}\" nebula-operator/nebula-cluster \\ \n # Specify the version of the cluster chart. If not specified, it will install the latest version by default.\n # You can check all chart versions by running the command: helm search repo -l nebula-operator/nebula-cluster\n --version=1.8.0 \\\n # Specify the namespace for the NebulaGraph cluster.\n --namespace=\"${NEBULA_CLUSTER_NAMESPACE}\" \\\n # Customize the cluster name.\n --set nameOverride=\"${NEBULA_CLUSTER_NAME}\" \\\n --set nebula.storageClassName=\"${STORAGE_CLASS_NAME}\" \\\n # Specify the version for the NebulaGraph cluster.\n --set nebula.version=vmaster\n
Check the status of NebulaGraph cluster pods.
kubectl -n \"${NEBULA_CLUSTER_NAMESPACE}\" get pod -l \"app.kubernetes.io/cluster=${NEBULA_CLUSTER_NAME}\"\n
Output:
NAME READY STATUS RESTARTS AGE\nnebula-exporter-854c76989c-mp725 1/1 Running 0 14h\nnebula-graphd-0 1/1 Running 0 14h\nnebula-graphd-1 1/1 Running 0 14h\nnebula-metad-0 1/1 Running 0 14h\nnebula-metad-1 1/1 Running 0 14h\nnebula-metad-2 1/1 Running 0 14h\nnebula-storaged-0 1/1 Running 0 14h\nnebula-storaged-1 1/1 Running 0 14h\nnebula-storaged-2 1/1 Running 0 14h\n
This topic introduces how to upgrade a NebulaGraph cluster created with NebulaGraph Operator.
Legacy version compatibility
The 1.x version NebulaGraph Operator is not compatible with NebulaGraph of version below v3.x.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.2.cluster-upgrade/#limits","title":"Limits","text":"You have created a NebulaGraph cluster. For details, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.2.cluster-upgrade/#upgrade_a_nebulagraph_cluster_with_kubectl","title":"Upgrade a NebulaGraph cluster withkubectl
","text":"The following steps upgrade a NebulaGraph cluster from version 3.5.0
to master
.
Check the image version of the services in the cluster.
kubectl get pods -l app.kubernetes.io/cluster=nebula -o jsonpath=\"{.items[*].spec.containers[*].image}\" |tr -s '[[:space:]]' '\\n' |sort |uniq -c\n
Output:
1 vesoft/nebula-graphd:3.5.0\n 1 vesoft/nebula-metad:3.5.0\n 3 vesoft/nebula-storaged:3.5.0 \n
Edit the nebula
cluster configuration to change the version
value of the cluster services from 3.5.0 to master.
Open the YAML file for the nebula
cluster.
kubectl edit nebulacluster nebula -n <namespace>\n
Change the value of version
.
After making these changes, the YAML file should look like this:
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\nspec:\n graphd:\n version: master // Change the value from 3.5.0 to master.\n ...\n metad:\n version: master // Change the value from 3.5.0 to master.\n ...\n storaged:\n version: master // Change the value from 3.5.0 to master.\n ...\n
Apply the configuration.
After saving the YAML file and exiting, Kubernetes automatically updates the cluster's configuration and starts the cluster upgrade.
After waiting for about 2 minutes, run the following command to see if the image versions of the services in the cluster have been changed to master.
kubectl get pods -l app.kubernetes.io/cluster=nebula -o jsonpath=\"{.items[*].spec.containers[*].image}\" |tr -s '[[:space:]]' '\\n' |sort |uniq -c\n
Output:
1 vesoft/nebula-graphd:master\n 1 vesoft/nebula-metad:master\n 3 vesoft/nebula-storaged:master \n
helm
","text":"Update the information of available charts locally from chart repositories.
helm repo update\n
Set environment variables to your desired values.
export NEBULA_CLUSTER_NAME=nebula # The desired NebulaGraph cluster name.\nexport NEBULA_CLUSTER_NAMESPACE=nebula # The desired namespace where your NebulaGraph cluster locates.\n
Upgrade a NebulaGraph cluster.
For example, upgrade a cluster to master.
helm upgrade \"${NEBULA_CLUSTER_NAME}\" nebula-operator/nebula-cluster \\\n --namespace=\"${NEBULA_CLUSTER_NAMESPACE}\" \\\n --set nameOverride=${NEBULA_CLUSTER_NAME} \\\n --set nebula.version=master\n
The value of --set nebula.version
specifies the version of the cluster you want to upgrade to.
Run the following command to check the status and version of the upgraded cluster.
Check cluster status:
$ kubectl -n \"${NEBULA_CLUSTER_NAMESPACE}\" get pod -l \"app.kubernetes.io/cluster=${NEBULA_CLUSTER_NAME}\"\nNAME READY STATUS RESTARTS AGE\nnebula-graphd-0 1/1 Running 0 2m\nnebula-graphd-1 1/1 Running 0 2m\nnebula-metad-0 1/1 Running 0 2m\nnebula-metad-1 1/1 Running 0 2m\nnebula-metad-2 1/1 Running 0 2m\nnebula-storaged-0 1/1 Running 0 2m\nnebula-storaged-1 1/1 Running 0 2m\nnebula-storaged-2 1/1 Running 0 2m\n
Check cluster version:
$ kubectl get pods -l app.kubernetes.io/cluster=nebula -o jsonpath=\"{.items[*].spec.containers[*].image}\" |tr -s '[[:space:]]' '\\n' |sort |uniq -c\n 1 vesoft/nebula-graphd:master\n 1 vesoft/nebula-metad:master\n 3 vesoft/nebula-storaged:master\n
The upgrade process of a cluster is a rolling update process and can be time-consuming due to the state transition of the leader partition replicas in the Storage service. You can configure the enableForceUpdate
field in the cluster instance's YAML file to skip the leader partition replica transfer operation, thereby accelerating the upgrade process. For more information, see Specify a rolling update strategy.
If you encounter issues during the upgrade process, you can check the logs of the cluster service pods.
kubectl logs <pod-name> -n <namespace>\n
Additionally, you can inspect the cluster's status and events.
kubectl describe nebulaclusters <cluster-name> -n <namespace>\n
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.3.cluster-uninstall/","title":"Delete a NebulaGraph cluster","text":"This topic explains how to delete a NebulaGraph cluster created using NebulaGraph Operator.
"},{"location":"k8s-operator/4.cluster-administration/4.1.installation/4.1.3.cluster-uninstall/#usage_limitations","title":"Usage limitations","text":"kubectl
","text":"View all created clusters.
kubectl get nc --all-namespaces\n
Example output:
NAMESPACE NAME READY GRAPHD-DESIRED GRAPHD-READY METAD-DESIRED METAD-READY STORAGED-DESIRED STORAGED-READY AGE\ndefault nebula True 2 2 3 3 3 3 38h\nnebula nebula2 True 1 1 1 1 1 1 2m7s\n
Delete a cluster. For example, run the following command to delete a cluster named nebula2
:
kubectl delete nc nebula2 -n nebula\n
Example output:
nebulacluster.nebula-graph.io \"nebula2\" deleted\n
Confirm the deletion.
kubectl get nc nebula2 -n nebula\n
Example output:
No resources found in nebula namespace.\n
helm
","text":"View all Helm releases.
helm list --all-namespaces\n
Example output:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION\nnebula default 1 2023-11-06 20:16:07.913136377 +0800 CST deployed nebula-cluster-1.7.1 1.7.1\nnebula-operator nebula-operator-system 3 2023-11-06 12:06:24.742397418 +0800 CST deployed nebula-operator-1.7.1 1.7.1\n
View detailed information about a Helm release. For example, to view the cluster information for a Helm release named nebula
:
helm get values nebula -n default\n
Example output:
USER-SUPPLIED VALUES:\nimagePullSecrets:\n- name: secret_for_pull_image\nnameOverride: nebula # The cluster name\nnebula:\n graphd:\n image: reg.vesoft-inc.com/xx\n metad:\n image: reg.vesoft-inc.com/xx\n licenseManagerURL: xxx:9119\n storageClassName: local-sc\n storaged:\n image: reg.vesoft-inc.com/xx\n version: v1.8.0 # The cluster version\n
Uninstall a Helm release. For example, to uninstall a Helm release named nebula
:
helm uninstall nebula -n default\n
Example output:
release \"nebula\" uninstalled\n
Once the Helm release is uninstalled, NebulaGraph Operator will automatically remove all K8s resources associated with that release.
Verify that the cluster resources are removed.
kubectl get nc nebula -n default\n
Example output:
No resources found in default namespace.\n
Local Persistent Volumes (Local PVs) in K8s store container data directly in the node's local disk directory. Compared with network storage, Local PVs provide higher IOPS and lower read and write latency, which suits data-intensive applications. This topic introduces how to use Local PVs in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS) clusters, and how to enable automatic failover for Local PVs in the cloud.
While Local PVs can enhance performance, note that, unlike network storage, local storage does not support automatic backup. In the event of a node failure, all data in local storage may be lost. Using Local PVs therefore involves a trade-off between service availability, data persistence, and flexibility.
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.1.use-local-pv/#principles","title":"Principles","text":"NebulaGraph Operator implements a Storage Volume Provisioner interface to automatically create and delete PV objects. Utilizing the provisioner, you can dynamically generate Local PVs as required. Based on the PVC and StorageClass specified in the cluster configuration file, NebulaGraph Operator automatically generates PVCs and associates them with their respective Local PVs.
When a Local PV is created through the provisioner interface, the provisioner controller generates a local-type PV and configures its nodeAffinity field. This configuration ensures that Pods using the local-type PV are scheduled onto specific nodes. Conversely, when a Local PV is deleted, the provisioner controller removes the local-type PV object and cleans up the node's storage resources.
NebulaGraph Operator is installed. For details, see Install NebulaGraph Operator.
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.1.use-local-pv/#steps","title":"Steps","text":"The resources in the following examples are all created in the default
namespace.
Create a node pool with local SSDs if not existing
gcloud container node-pools create \"pool-1\" --cluster \"gke-1\" --region us-central1 --node-version \"1.27.10-gke.1055000\" --machine-type \"n2-standard-2\" --local-nvme-ssd-block count=2 --max-surge-upgrade 1 --max-unavailable-upgrade 0 --num-nodes 1 --enable-autoscaling --min-nodes 1 --max-nodes 2\n
For information about the parameters to create a node pool with local SSDs, see Create a node pool with Local SSD.
Format and mount the local SSDs using a DaemonSet.
Download the gke-daemonset-raid-disks.yaml file.
Deploy the RAID disks DaemonSet. The DaemonSet sets a RAID 0
array on all Local SSD disks and formats the device to an ext4
filesystem.
kubectl apply -f gke-daemonset-raid-disks.yaml\n
Deploy the Local PV provisioner.
kubectl apply -f local-pv-provisioner.yaml\n
In the NebulaGraph cluster configuration file, specify spec.storaged.dataVolumeClaims
or spec.metad.dataVolumeClaim
, and the StorageClass needs to be configured as local-nvme
. For more information about cluster configurations, see Create a NebulaGraph cluster.
...\nmetad: \n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme\nstoraged:\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme \n...\n
After the NebulaGraph is deployed, the Local PVs are automatically created.
View the PV list.
kubectl get pv\n
Return:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE\npvc-01be9b75-9c50-4532-8695-08e11b489718 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 local-nvme 3m35s\npvc-09de8eb1-1225-4025-b91b-fbc0bcce670f 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-1 local-nvme 3m35s\npvc-4b2a9ffb-9000-4998-a7bb-edb825c872cb 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-2 local-nvme 3m35s\n...\n
View the detailed information of the PV.
kubectl get pv pvc-01be9b75-9c50-4532-8695-08e11b489718 -o yaml\n
Return:
apiVersion: v1\nkind: PersistentVolume\nmetadata:\n annotations:\n local.pv.provisioner/selected-node: gke-snap-test-snap-test-591403a8-xdfc\n nebula-graph.io/pod-name: nebula-storaged-0\n pv.kubernetes.io/provisioned-by: nebula-cloud.io/local-pv\n creationTimestamp: \"2024-03-05T06:12:32Z\"\n finalizers:\n - kubernetes.io/pv-protection\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: storaged\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: pvc-01be9b75-9c50-4532-8695-08e11b489718\n resourceVersion: \"9999469\"\n uid: ee28a4da-6026-49ac-819b-2075154b4724\nspec:\n accessModes:\n - ReadWriteOnce\n capacity:\n storage: 5Gi\n claimRef:\n apiVersion: v1\n kind: PersistentVolumeClaim\n name: storaged-data-nebula-storaged-0\n namespace: default\n resourceVersion: \"9996541\"\n uid: 01be9b75-9c50-4532-8695-08e11b489718\n local:\n fsType: ext4\n path: /mnt/disks/raid0\n nodeAffinity:\n required:\n nodeSelectorTerms:\n - matchExpressions:\n - key: kubernetes.io/hostname\n operator: In\n values:\n - gke-snap-test-snap-test-591403a8-xdfc\n persistentVolumeReclaimPolicy: Delete\n storageClassName: local-nvme\n volumeMode: Filesystem\nstatus:\n phase: Bound \n
Create a node pool with Instance Store if not existing.
eksctl create nodegroup --instance-types m5ad.2xlarge --nodes 3 --cluster eks-1\n
For more information about parameters to cluster node pools, see Creating a managed node group.
Format and mount the local SSDs using a DaemonSet.
Download the eks-daemonset-raid-disks.yaml file.
Based on the node type created in step 1, modify the value of the nodeSelector.node.kubernetes.io/instance-type
field in the eks-daemonset-raid-disks.yaml
file as needed.
spec:\n nodeSelector:\n node.kubernetes.io/instance-type: \"m5ad.2xlarge\"\n
Install nvme-cli.
sudo apt-get update\nsudo apt-get install -y nvme-cli\n
sudo yum install -y nvme-cli\n
Deploy the RAID disk DaemonSet. The DaemonSet sets up a RAID 0
array on all local SSD disks and formats the devices as an ext4
file system.
kubectl apply -f gke-daemonset-raid-disks.yaml\n
Deploy the Local PV provisioner.
kubectl apply -f local-pv-provisioner.yaml\n
In the NebulaGraph cluster configuration file, specify spec.storaged.dataVolumeClaims
or spec.metad.dataVolumeClaim
, and the StorageClass needs to be configured as local-nvme
. For more information about cluster configurations, see Create a NebulaGraph cluster.
metad:\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme\nstoraged:\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: local-nvme \n
View the PV list.
kubectl get pv\n
Return:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE\npvc-290c15cc-a302-4463-a591-84b7217a6cd2 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 local-nvme 3m40s\npvc-fbb3167f-f556-4a16-ae0e-171aed0ac954 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-1 local-nvme 3m40s\npvc-6c7cfe80-0134-4573-b93e-9b259c6fcd63 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-2 local-nvme 3m40s\n...\n
View the detailed information of the PV.
kubectl get pv pvc-290c15cc-a302-4463-a591-84b7217a6cd2 -o yaml\n
Return:
apiVersion: v1\nkind: PersistentVolume\nmetadata:\n annotations:\n local.pv.provisioner/selected-node: ip-192-168-77-60.ec2.internal\n nebula-graph.io/pod-name: nebula-storaged-0\n pv.kubernetes.io/provisioned-by: nebula-cloud.io/local-pv\n creationTimestamp: \"2024-03-04T07:51:32Z\"\n finalizers:\n - kubernetes.io/pv-protection\n labels:\n app.kubernetes.io/cluster: nebula\n app.kubernetes.io/component: storaged\n app.kubernetes.io/managed-by: nebula-operator\n app.kubernetes.io/name: nebula-graph\n name: pvc-290c15cc-a302-4463-a591-84b7217a6cd2\n resourceVersion: \"7932689\"\n uid: 66c0a2d3-2914-43ad-93b5-6d84fb62acef\nspec:\n accessModes:\n - ReadWriteOnce\n capacity:\n storage: 5Gi\n claimRef:\n apiVersion: v1\n kind: PersistentVolumeClaim\n name: storaged-data-nebula-storaged-0\n namespace: default\n resourceVersion: \"7932688\"\n uid: 8ecb5d96-004b-4672-bac4-1355ae15eae4\n local:\n fsType: ext4\n path: /mnt/disks/raid0\n nodeAffinity:\n required:\n nodeSelectorTerms:\n - matchExpressions:\n - key: kubernetes.io/hostname\n operator: In\n values:\n - ip-192-168-77-60.ec2.internal\n persistentVolumeReclaimPolicy: Delete\n storageClassName: local-nvme\n volumeMode: Filesystem\nstatus:\n phase: Bound \n
When using network storage (e.g., AWS EBS, Google Cloud Persistent Disk, Azure Disk Storage, Ceph, NFS, etc.) as a PV, the storage resource is independent of any particular node. Therefore, the storage resource can be mounted and used by Pods regardless of the node to which the Pods are scheduled. However, when using a local storage disk as a PV, the storage resource can only be used by Pods on a specific node due to nodeAffinity.
The Storage service of NebulaGraph supports data redundancy, which allows you to set multiple odd-numbered partition replicas. When a node fails, the associated partition is automatically transferred to a healthy node. However, Storage Pods using Local Persistent Volumes cannot run on other nodes due to the node affinity setting and must wait for the node to recover. To run on another node, the Pods must be unbound from the associated Local Persistent Volume.
NebulaGraph Operator supports automatic failover in the event of a node failure while using Local Persistent Volumes in the cloud for elastic scaling. This is achieved by setting spec.enableAutoFailover
to true
in the cluster configuration file, which automatically unbinds the Pods from the Local Persistent Volume, allowing the Pods to run on another node.
Example configuration:
...\nspec:\n # Enable automatic failover for Local PV.\n enableAutoFailover: true\n # The time to wait for the Storage service to be in the `OFFLINE` status\n # before automatic failover. \n # The default value is 5 minutes.\n # If the Storage service recovers to the `ONLINE` status during this period,\n # failover will not be triggered.\n failoverPeriod: \"2m\"\n ...\n
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.2.pv-expansion/","title":"Dynamically expand persistent volumes","text":"In a Kubernetes environment, NebulaGraph's data is stored on Persistent Volumes (PVs). Dynamic volume expansion refers to increasing the capacity of a volume without stopping the service, enabling NebulaGraph to accommodate growing data. This topic explains how to dynamically expand the PV for NebulaGraph services in a Kubernetes environment.
Note
In Kubernetes, a StorageClass is a resource that defines a particular storage type. It describes a class of storage, including its provisioner, parameters, and other details. When creating a PersistentVolumeClaim (PVC) and specifying a StorageClass, Kubernetes automatically creates a corresponding PV. The principle of dynamic volume expansion is to edit the PVC and increase the volume's capacity. Kubernetes will then automatically expand the capacity of the PV associated with this PVC based on the specified storageClassName
in the PVC. During this process, new PVs are not created; the size of the existing PV is changed. Only dynamic storage volumes, typically those associated with a storageClassName
, support dynamic volume expansion. Additionally, the allowVolumeExpansion
field in the StorageClass must be set to true
. For more details, see the Kubernetes documentation on expanding Persistent Volume Claims.
In NebulaGraph Operator, you cannot directly edit PVC because Operator automatically creates PVC based on the configuration in the spec.<metad|storaged>.dataVolumeClaim
of the Nebula Graph cluster. Therefore, you need to modify the cluster's configuration to update the PVC and trigger dynamic online volume expansion for the PV.
allowVolumeExpansion
field in the StorageClass is set to true
.provisioner
configured in the StorageClass supports dynamic expansion.In the following example, we assume that the StorageClass is named ebs-sc
and the NebulaGraph cluster is named nebula
. We will demonstrate how to dynamically expand the PV for the Storage service.
Check the status of the Storage service Pod:
kubectl get pod\n
Example output:
nebula-storaged-0 1/1 Running 0 43h\n
Check the PVC and PV information for the Storage service:
# View PVC \nkubectl get pvc\n
Example output:
storaged-data-nebula-storaged-0 Bound pvc-36ca3871-9265-460f-b812-7e73a718xxxx 5Gi RWO ebs-sc 43h\n
# View PV and confirm that the capacity of the PV is 5Gi\nkubectl get pv\n
Example output:
pvc-36ca3871-9265-460f-b812-xxx 5Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 ebs-sc 43h\n
Assuming all the above-mentioned prerequisites are met, use the following command to request an expansion of the PV for the Storage service to 10Gi:
kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"dataVolumeClaims\":[{\"resources\": {\"requests\": {\"storage\": \"10Gi\"}}, \"storageClassName\": \"ebs-sc\"}]}}}'\n
Example output:
nebulacluster.apps.nebula-graph.io/nebula patched\n
After waiting for about a minute, check the expanded PVC and PV information:
kubectl get pvc\n
Example output:
storaged-data-nebula-storaged-0 Bound pvc-36ca3871-9265-460f-b812-7e73a718xxxx 10Gi RWO ebs-sc 43h\n
kubectl get pv\n
Example output:
pvc-36ca3871-9265-460f-b812-xxx 10Gi RWO Delete Bound default/storaged-data-nebula-storaged-0 ebs-sc 43h\n
As you can see, both the PVC and PV capacity have been expanded to 10Gi.
NebulaGraph Operator uses PVs (Persistent Volumes) and PVCs (Persistent Volume Claims) to store persistent data. If you accidentally deletes a NebulaGraph cluster, by default, PV and PVC objects and the relevant data will be retained to ensure data security.
You can also define the automatic deletion of PVCs to release data by setting the parameter spec.enablePVReclaim
to true
in the configuration file of the cluster instance. As for whether PV will be deleted automatically after PVC is deleted, you need to customize the PV reclaim policy. See reclaimPolicy in StorageClass and PV Reclaiming for details.
A NebulaGraph cluster is created in Kubernetes. For specific steps, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.4.storage-management/4.4.3.configure-pv-reclaim/#steps","title":"Steps","text":"The following example uses a cluster named nebula
and the cluster's configuration file named nebula_cluster.yaml
to show how to set enablePVReclaim
:
Run the following command to edit the nebula
cluster's configuration file.
kubectl edit nebulaclusters.apps.nebula-graph.io nebula\n
Add enablePVReclaim
and set its value to true
under spec
.
apiVersion: apps.nebula-graph.io/v1alpha1\nkind: NebulaCluster\nmetadata:\n name: nebula\nspec:\n enablePVReclaim: true //Set its value to true.\n graphd:\n image: vesoft/nebula-graphd\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: master\n imagePullPolicy: IfNotPresent\n metad:\n dataVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n image: vesoft/nebula-metad\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n replicas: 1\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: master\n nodeSelector:\n nebula: cloud\n reference:\n name: statefulsets.apps\n version: v1\n schedulerName: default-scheduler\n storaged:\n dataVolumeClaims:\n - resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n - resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n image: vesoft/nebula-storaged\n logVolumeClaim:\n resources:\n requests:\n storage: 2Gi\n storageClassName: fast-disks\n replicas: 3\n resources:\n limits:\n cpu: \"1\"\n memory: 1Gi\n requests:\n cpu: 500m\n memory: 500Mi\n version: master\n... \n
Run kubectl apply -f nebula_cluster.yaml
to push your configuration changes to the cluster.
After setting enablePVReclaim
to true
, the PVCs of the cluster will be deleted automatically after the cluster is deleted. If you want to delete the PVs, you need to set the reclaim policy of the PVs to Delete
.
Kubernetes Admission Control is a security mechanism running as a webhook at runtime. It intercepts and modifies requests to ensure the cluster's security. Admission webhooks involve two main operations: validation and mutation. NebulaGraph Operator supports only validation operations and provides some default admission control rules. This topic describes NebulaGraph Operator's default admission control rules and how to enable admission control.
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.2.enable-admission-control/#prerequisites","title":"Prerequisites","text":"A NebulaGraph cluster is created with NebulaGrpah Operator. For detailed steps, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.2.enable-admission-control/#admission_control_rules","title":"Admission control rules","text":"Kubernetes admission control allows you to insert custom logic or policies before Kubernetes API Server processes requests. This mechanism can be used to implement various security policies, such as restricting a Pod's resource consumption or limiting its access permissions. NebulaGraph Operator supports validation operations, which means it validates and intercepts requests without making changes.
After admission control is enabled, NebulaGraph Operator implements the following admission validation control rules by default. You cannot disable these rules:
dataVolumeClaims
.After admission control is enabled, NebulaGraph Operator allows you to add annotations to implement the following admission validation control rules:
Clusters with the ha-mode
annotation must have the minimum number of replicas as required by high availability mode:
Note
High availability mode refers to the high availability of NebulaGraph cluster services. Storage and Meta services are stateful, and the number of replicas should be an odd number due to Raft protocol requirements for data consistency. In high availability mode, at least 3 Storage services and 3 Meta services are required. Graph services are stateless, so their number of replicas can be even but should be at least 2.
delete-protection
annotation cannot be deleted. For more information, see Configure deletion protection. To ensure secure communication and data integrity between the K8s API server and the admission webhook, this communication is done over HTTPS by default. This means that TLS certificates are required for the admission webhook. cert-manager is a Kubernetes certificate management controller that automates the issuance and renewal of certificates. NebulaGraph Operator uses cert-manager to manage certificates.
Once cert-manager is installed and admission control is enabled, NebulaGraph Operator will automatically create an Issuer for issuing the necessary certificate for the admission webhook, and a Certificate for storing the issued certificate. The issued certificate is stored in the nebula-operator-webhook-secret
Secret.
Install cert-manager.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.1/cert-manager.yaml\n
It is suggested to deploy the latest version of cert-manager. For details, see the official cert-manager documentation.
Modify the NebulaGraph Operator configuration file to enable admission control. Admission control is disabled by default and needs to be enabled manually.
# Check the current configuration\nhelm show values nebula-operator/nebula-operator\n
# Modify the configuration by setting `enableAdmissionWebhook` to `true`.\nhelm upgrade nebula-operator nebula-operator/nebula-operator --set enableAdmissionWebhook=true\n
Note
nebula-operator
is the name of the chart repository, and nebula-operator/nebula-operator
is the chart name. If the chart's namespace is not specified, it defaults to default
.
View the certificate Secret for the admission webhook.
kubectl get secret nebula-operator-webhook-secret -o yaml\n
If the output includes certificate contents, it means that the admission webhook's certificate has been successfully created.
Verify the control rules.
Verify preventing additional PVs from being added to Storage service.
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"dataVolumeClaims\":[{\"resources\": {\"requests\": {\"storage\": \"2Gi\"}}, \"storageClassName\": \"local-path\"},{\"resources\": {\"requests\": {\"storage\": \"3Gi\"}}, \"storageClassName\": \"fask-disks\"}]}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" deniedthe request: spec.storaged.dataVolumeClaims: Forbidden: storaged dataVolumeClaims is immutable\n
Verify disallowing shrinking Storage service's PVC capacity.
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"dataVolumeClaims\":[{\"resources\": {\"requests\": {\"storage\": \"1Gi\"}}, \"storageClassName\": \"fast-disks\"}]}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: spec.storaged.dataVolumeClaims: Invalid value: resource.Quantity{i:resource.int64Amount{value:1073741824, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:\"1Gi\", Format:\"BinarySI\"}: data volume size can only be increased\n
Verify disallowing any secondary operation during Storage service scale-in.
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"replicas\": 5}}}'\nnebulacluster.apps.nebula-graph.io/nebula patched\n$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"replicas\": 3}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: [spec.storaged: Forbidden: field is immutable while in ScaleOut phase, spec.storaged.replicas: Invalid value: 3: field is immutable while not in Running phase]\n
Verify the minimum number of replicas in high availability mode.
# Annotate the cluster to enable high availability mode.\n$ kubectl annotate nc nebula nebula-graph.io/ha-mode=true\n# Verify the minimum number of the Graph service's replicas.\n$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"graphd\": {\"replicas\":1}}}'\nError from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: spec.graphd.replicas: Invalid value: 1: should be at least 2 in HA mode\n
NebulaGraph Operator supports deletion protection to prevent NebulaGraph clusters from being deleted by accident. This topic describes how to configure deletion protection for a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.3.config-deletion-protection/#prerequisites","title":"Prerequisites","text":"Add the delete-protection
annotation to the cluster.
kubectl annotate nc nebula -n nebula-test nebula-graph.io/delete-protection=true\n
The preceding command enables deletion protection for the nebula
cluster in the nebula-test
namespace."},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.3.config-deletion-protection/#verify_deletion_protection","title":"Verify deletion protection","text":"To verify that deletion protection is enabled, run the following command:
kubectl delete nc nebula -n nebula-test\n
The preceding command attempts to delete the nebula
cluster in the nebula-test
namespace.
Return:
Error from server: admission webhook \"nebulaclustervalidating.nebula-graph.io\" denied the request: metadata.annotations[nebula-graph.io/delete-protection]: Forbidden: protected cluster cannot be deleted\n
"},{"location":"k8s-operator/4.cluster-administration/4.7.security/4.7.3.config-deletion-protection/#remove_the_annotation_to_disable_deletion_protection","title":"Remove the annotation to disable deletion protection","text":"Remove the delete-protection
annotation from the cluster as follows:
kubectl annotate nc nebula -n nebula-test nebula-graph.io/delete-protection-\n
The preceding command disables deletion protection for the nebula
cluster in the nebula-test
namespace.
NebulaGraph Operator calls the interface provided by NebulaGraph clusters to dynamically sense cluster service status. Once an exception is detected (for example, a component in a NebulaGraph cluster stops running), NebulaGraph Operator automatically performs fault tolerance. This topic shows how Nebular Operator performs self-healing by simulating cluster failure of deleting one Storage service Pod in a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.8.ha-and-balancing/4.8.1.self-healing/#prerequisites","title":"Prerequisites","text":"Install NebulaGraph Operator
"},{"location":"k8s-operator/4.cluster-administration/4.8.ha-and-balancing/4.8.1.self-healing/#steps","title":"Steps","text":"Create a NebulaGraph cluster. For more information, see Create a NebulaGraph clusters.
Delete the Pod named <cluster_name>-storaged-2
after all pods are in the Running
status.
kubectl delete pod <cluster-name>-storaged-2 --now\n
<cluster_name>
is the name of your NebulaGraph cluster. NebulaGraph Operator automates the creation of the Pod named <cluster-name>-storaged-2
to perform self-healing.
Run the kubectl get pods
command to check the status of the Pod <cluster-name>-storaged-2
.
...\nnebula-cluster-storaged-1 1/1 Running 0 5d23h\nnebula-cluster-storaged-2 0/1 ContainerCreating 0 1s\n...\n
...\nnebula-cluster-storaged-1 1/1 Running 0 5d23h\nnebula-cluster-storaged-2 1/1 Running 0 4m2s\n...\n
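Rather than polling with repeated kubectl get pods calls, you can watch the transition live. A sketch assuming a cluster named nebula, using the same label-selector pattern that appears in the restart topics below:
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=storaged -w\n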
When the status of <cluster-name>-storaged-2
is changed from ContainerCreating
to Running
, the self-healing is performed successfully.
NebulaGraph clusters use a distributed architecture to divide data into multiple logical partitions, which are typically evenly distributed across different nodes. In distributed systems, there are usually multiple replicas of the same data. To ensure the consistency of data across multiple replicas, NebulaGraph clusters use the Raft protocol to synchronize multiple partition replicas. In the Raft protocol, each partition elects a leader replica, which is responsible for handling write requests, while follower replicas handle read requests.
When a NebulaGraph cluster created by NebulaGraph Operator performs a rolling update, a storage node temporarily stops providing services for the update. For an overview of rolling updates, see Performing a Rolling Update. If the node hosting the leader replica stops providing services, it will result in the unavailability of read and write operations for that partition. To avoid this situation, by default, NebulaGraph Operator transfers the leader replicas to other unaffected nodes during the rolling update process of a NebulaGraph cluster. This way, when a storage node is being updated, the leader replicas on other nodes can continue processing client requests, ensuring the read and write availability of the cluster.
The process of migrating all leader replicas from one storage node to the other nodes may take a long time. To better control the rolling update duration, Operator provides a field called enableForceUpdate
. When it is confirmed that there is no external access traffic, you can set this field to true
. This way, the leader replicas will not be transferred to other nodes, thereby speeding up the rolling update process.
Operator triggers a rolling update of the NebulaGraph cluster under the following circumstances:
In the YAML file for creating a cluster instance, add the spec.storaged.enableForceUpdate
field and set it to true
or false
to control the rolling update speed.
When enableForceUpdate
is set to true
, it means that the leader partition replicas are not transferred, thus speeding up the rolling update process. Conversely, when set to false
, it means that the leader replicas are transferred to other nodes to ensure the read and write availability of the cluster. The default value is false
.
Warning
When setting enableForceUpdate
to true
, make sure there is no traffic entering the cluster for read and write operations. This is because this setting will force the cluster pods to be rebuilt, and during this process, data loss or client request failures may occur.
Configuration example:
...\nspec:\n...\n storaged:\n # When set to true,\n # it means that the leader partition replicas are not transferred,\n # but the cluster pods are rebuilt directly.\n enableForceUpdate: true \n ...\n
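For a cluster that is already running, the same field can be toggled without editing the original YAML file, following the kubectl patch pattern shown earlier for scaling. A sketch assuming the cluster is named nebula:
$ kubectl patch nc nebula --type='merge' --patch '{\"spec\": {\"storaged\": {\"enableForceUpdate\": true}}}'\n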
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/","title":"Restart service Pods in a NebulaGraph cluster on K8s","text":"Note
Restarting NebulaGraph cluster service Pods is a feature in the Alpha version.
During routine maintenance, it might be necessary to restart a specific service Pod in the NebulaGraph cluster, for instance, when the Pod's status is abnormal or to enforce a restart. Restarting a Pod essentially means restarting the service process. To ensure high availability, NebulaGraph Operator supports gracefully restarting all Pods of the Graph, Meta, or Storage service respectively and gracefully restarting an individual Pod of the Storage service.
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/#prerequisites","title":"Prerequisites","text":"A NebulaGraph cluster is created in a K8s environment. For details, see Create a NebulaGraph cluster.
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/#restart_all_pods_of_a_certain_service_type","title":"Restart all Pods of a certain service type","text":"To gracefully roll restart all Pods of a certain service type in the cluster, you can add an annotation (nebula-graph.io/restart-timestamp
) with the current time to the configuration of the StatefulSet controller of the corresponding service.
When NebulaGraph Operator detects that the StatefulSet controller of the corresponding service has the annotation nebula-graph.io/restart-timestamp
and its value is changed, it triggers the graceful rolling restart operation for all Pods of that service type in the cluster.
In the following example, the annotation is added for the Graph service so that all of its Pods are restarted one by one.
Assume that the cluster name is nebula
and the cluster resources are in the default
namespace. Run the following command:
Check the name of the StatefulSet controller.
kubectl get statefulset \n
Example output:
NAME READY AGE\nnebula-graphd 2/2 33s\nnebula-metad 3/3 69s\nnebula-storaged 3/3 69s\n
Get the current timestamp.
date -u +%s\n
Example output:
1700547115\n
Overwrite the timestamp annotation of the StatefulSet controller to trigger the graceful rolling restart operation.
kubectl annotate statefulset nebula-graphd nebula-graph.io/restart-timestamp=\"1700547115\" --overwrite\n
Example output:
statefulset.apps/nebula-graphd annotated\n
Observe the restart process.
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=graphd -w\n
Example output:
NAME READY STATUS RESTARTS AGE\nnebula-graphd-0 1/1 Running 0 9m37s\nnebula-graphd-1 0/1 Running 0 17s\nnebula-graphd-1 1/1 Running 0 20s\nnebula-graphd-0 1/1 Terminating 0 9m40s\nnebula-graphd-0 0/1 Terminating 0 9m41s\nnebula-graphd-0 0/1 Terminating 0 9m42s\nnebula-graphd-0 0/1 Terminating 0 9m42s\nnebula-graphd-0 0/1 Terminating 0 9m42s\nnebula-graphd-0 0/1 Pending 0 0s\nnebula-graphd-0 0/1 Pending 0 0s\nnebula-graphd-0 0/1 ContainerCreating 0 0s\nnebula-graphd-0 0/1 Running 0 2s\n
The above output shows the status of the Graph service Pods during the restart process.
Verify that the StatefulSet controller annotation is updated.
kubectl get statefulset nebula-graphd -o yaml | grep \"nebula-graph.io/restart-timestamp\"\n
Example output:
nebula-graph.io/last-applied-configuration: '{\"persistentVolumeClaimRetentionPolicy\":{\"whenDeleted\":\"Retain\",\"whenScaled\":\"Retain\"},\"podManagementPolicy\":\"Parallel\",\"replicas\":2,\"revisionHistoryLimit\":10,\"selector\":{\"matchLabels\":{\"app.kubernetes.io/cluster\":\"nebula\",\"app.kubernetes.io/component\":\"graphd\",\"app.kubernetes.io/managed-by\":\"nebula-operator\",\"app.kubernetes.io/name\":\"nebula-graph\"}},\"serviceName\":\"nebula-graphd-headless\",\"template\":{\"metadata\":{\"annotations\":{\"nebula-graph.io/cm-hash\":\"7c55c0e5ac74e85f\",\"nebula-graph.io/restart-timestamp\":\"1700547815\"},\"creationTimestamp\":null,\"labels\":{\"app.kubernetes.io/cluster\":\"nebula\",\"app.kubernetes.io/component\":\"graphd\",\"app.kubernetes.io/managed-by\":\"nebula-operator\",\"app.kubernetes.io/name\":\"nebula-graph\"}},\"spec\":{\"containers\":[{\"command\":[\"/bin/sh\",\"-ecx\",\"exec\nnebula-graph.io/restart-timestamp: \"1700547115\"\n nebula-graph.io/restart-timestamp: \"1700547815\" \n
The above output indicates that the annotation of the StatefulSet controller has been updated, and all Graph service Pods have been restarted.
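As a convenience, the timestamp lookup and the annotation update can be combined into one command via shell substitution. A sketch assuming the same nebula-graphd StatefulSet:
kubectl annotate statefulset nebula-graphd nebula-graph.io/restart-timestamp=\"$(date -u +%s)\" --overwrite\n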
"},{"location":"k8s-operator/4.cluster-administration/4.9.advanced/4.9.2.restart-cluster/#restart_a_single_storage_service_pod","title":"Restart a single Storage service Pod","text":"To gracefully roll restart a single Storage service Pod, you can add an annotation (nebula-graph.io/restart-ordinal
) with the value set to the ordinal number of the Storage service Pod you want to restart. This triggers a graceful restart or state transition for that specific Storage service Pod. The added annotation will be automatically removed after the Storage service Pod is restarted.
In the following example, the annotation is added for the Pod with ordinal number 1
, indicating a graceful restart for the nebula-storaged-1
Storage service Pod.
Assume that the cluster name is nebula
, and the cluster resources are in the default
namespace. Run the following commands:
Check the name of the StatefulSet controller.
kubectl get statefulset \n
Example output:
NAME READY AGE\nnebula-graphd 2/2 33s\nnebula-metad 3/3 69s\nnebula-storaged 3/3 69s\n
Get the ordinal number of the Storage service Pod.
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=storaged\n
Example output:
NAME READY STATUS RESTARTS AGE\nnebula-storaged-0 1/1 Running 0 13h\nnebula-storaged-1 1/1 Running 0 13h\nnebula-storaged-2 1/1 Running 0 13h\nnebula-storaged-3 1/1 Running 0 13h\nnebula-storaged-4 1/1 Running 0 13h\nnebula-storaged-5 1/1 Running 0 13h\nnebula-storaged-6 1/1 Running 0 13h\nnebula-storaged-7 1/1 Running 0 13h\nnebula-storaged-8 1/1 Running 0 13h\n
Add the annotation for the nebula-storaged-1
Pod to trigger a graceful restart for that specific Pod.
kubectl annotate statefulset nebula-storaged nebula-graph.io/restart-ordinal=\"1\" \n
Example output:
statefulset.apps/nebula-storaged annotated\n
Observe the restart process.
kubectl get pods -l app.kubernetes.io/cluster=nebula,app.kubernetes.io/component=storaged -w\n
Example output:
NAME READY STATUS RESTARTS AGE\nnebula-storaged-0 1/1 Running 0 13h\nnebula-storaged-1 1/1 Running 0 13h\nnebula-storaged-2 1/1 Running 0 13h\nnebula-storaged-3 1/1 Running 0 13h\nnebula-storaged-4 1/1 Running 0 13h\nnebula-storaged-5 1/1 Running 0 12h\nnebula-storaged-6 1/1 Running 0 12h\nnebula-storaged-7 1/1 Running 0 12h\nnebula-storaged-8 1/1 Running 0 12h\n\n\nnebula-storaged-1 1/1 Running 0 13h\nnebula-storaged-1 1/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Terminating 0 13h\nnebula-storaged-1 0/1 Pending 0 0s\nnebula-storaged-1 0/1 Pending 0 0s\nnebula-storaged-1 0/1 ContainerCreating 0 0s\nnebula-storaged-1 0/1 Running 0 1s\nnebula-storaged-1 1/1 Running 0 10s \n
The above output indicates that the nebula-storaged-1
Storage service Pod is successfully restarted.
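Because the restart-ordinal annotation is removed automatically once the Pod is back, you can confirm the cleanup by printing the StatefulSet annotations. A minimal sketch:
kubectl get statefulset nebula-storaged -o jsonpath='{.metadata.annotations}'\n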
After restarting a single Storage service Pod, the distribution of storage leader replicas may become unbalanced. You can execute the BALANCE LEADER
command to rebalance the distribution of leader replicas. For information about how to view the leader distribution, see SHOW HOSTS
.
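A sketch of the rebalancing flow in nebula-console, with placeholder connection parameters; note that the exact BALANCE syntax can differ between NebulaGraph versions, so check the statement reference for your release:
./nebula-console -addr <graphd_ip> -port 9669 -u root -p <password>\nnebula> SHOW HOSTS;\nnebula> BALANCE LEADER;\n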
Before using NebulaGraph Cloud, you need to create a subscription on Azure. This topic describes how to create a subscription on Azure Marketplace.
"},{"location":"nebula-cloud/2.how-to-create-subsciption/#subscription_workflow","title":"Subscription workflow","text":"Enter the Azure Marketplace, and search for NebulaGraph Cloud in the search bar in Marketplace, or directly click NebulaGraph Cloud to enter the subscription page. [TODO]
Select a plan according to your own needs and click Set up + subscribe.
On the Basics page of Subscribe NebulaGraph Cloud, fill in the following plan details:
Project details
Field Description Subscription Select a subscription. Resource group Select an existing resource group or create a new one.
SaaS details
Field Description Name Create a name for this SaaS subscription to easily identify it later. Recurring billingOn
or Off
. At the bottom of the Basics page, click Next: Tags.
After the subscription is completed, you need to click Open the SaaS account on the publisher's website
to create and configure your Solution. For details, see How to configure a Solution.
Solution refers to the NebulaGraph database running on NebulaGraph Cloud. After subscribing NebulaGraph Cloud on Azure, you need to configure your Solutions on the Cloud platform to complete the purchase. This topic describes how to configure a Solution.
"},{"location":"nebula-cloud/3.how-to-set-solution/#configuration_workflow","title":"Configuration workflow","text":"Log in to the Azure account that has subscribed the Solution service in NebulaGraph Cloud.
Select a region in the Provider section.
Caution
The region of the database you select should be in the same area as that of your business to avoid performance and speed problems.
In the Instance section, configure the type and number of query engines, and the type, number, and disk size of storage engines.
Caution
It is recommended to configure at least 2 query engines and 3 storage engines to ensure high service availability.
In the NebulaGraph section, enter the specified Azure account email address as the Root user.
Click Next at the bottom of this page.
You have now completed the configuration of the Solution. If the status of the Solution is running on the Cloud homepage, the Solution has been created successfully.
You may see the following status and corresponding description on the Solution page.
Status Description creating The resources required by a Solution are ready and the Solution will be created automatically. At this time, the Solution is in the creating state, which may last from several minutes to over ten minutes. starting After you have restarted a Solution, it will be in the starting state for a while. stopping After you have clicked Stop Solution, the Solution will be in the stopping state for a while. deleting After you have clicked Delete Solution, the Solution will be in the deleting state for a while. running After you create a Solution, it will be in the running state for a long time. stopped After you stop a Solution, it will be in the stopped state for a long time. deleted After you delete a Solution, it will be in the deleted state for a long time. create_failed If you failed to create a Solution, the Solution will be in the create_failed state for a long time. stop_failed If you failed to stop a Solution, the Solution will be in the stop_failed state for a long time. start_failed If you failed to start a Solution, the Solution will be in the start_failed state for a long time.Caution
If a Solution stays in an intermediate state for a long time and the page remains unchanged after refreshing, it means that there is an exception and you need to submit an order to solve the problem.
Caution
If a Solution is in the state of create_failed, stop_failed, or start_failed, you can execute CREATE, STOP, or START again.
"},{"location":"nebula-cloud/4.user-role-description/","title":"Cloud Solution roles","text":"After creating a Solution, you need to confirm the role privileges in the Cloud platform. This topic introduces the role privileges in the Cloud Solution.
"},{"location":"nebula-cloud/4.user-role-description/#built-in_roles","title":"Built-in roles","text":"NebulaGraph Cloud has multiple built-in roles:
On the Solution page, users with different roles will see different sidebars. The following describes the privileges of each role. Among them, Y means that this role can view this page, and N means that it cannot.
Page OWNER ROOT USER Solution Info Y Y Y Applications Y Y Y Connectivity Y N N Root Management Y N N User Management N Y N Audit Log Y N N Settings Y N N Subscribe Settings Y N N Billing Y N N"},{"location":"nebula-cloud/7.terms-and-conditions/","title":"Terms of Service","text":"These terms and conditions (\"Agreement\") sets forth the general terms and conditions of your use of the https://cloud.nebula-cloud.io website (\"Website\" or \"Service\") and any of its related products and services (collectively, \"Services\"). This Agreement is legally binding between you (\"User\", \"you\" or \"your\") and vesoft inc. (\"vesoft inc.\", \"we\", \"us\" or \"our\"). By accessing and using the Website and Services, you acknowledge that you have read, understood, and agree to be bound by the terms of this Agreement. If you are entering into this Agreement on behalf of a business or other legal entity, you represent that you have the authority to bind such entity to this Agreement, in which case the terms \"User\", \"you\" or \"your\" shall refer to such entity. If you do not have such authority, or if you do not agree with the terms of this Agreement, you must not accept this Agreement and may not access and use the Website and Services. You acknowledge that this Agreement is a contract between you and vesoft inc., even though it is electronic and is not physically signed by you, and it governs your use of the Website and Services.
"},{"location":"nebula-cloud/7.terms-and-conditions/#accounts","title":"Accounts","text":"You give NebulaGraph Cloud permission to use your Azure account as your NebulaGraph Cloud account and get your account information so that NebulaGraph Cloud can contact you regarding this product and related products. You understand that the rights to use NebulaGraph Cloud come from vesoft instead of Microsoft, vesoft is the provider of this product. Use of NebulaGraph Cloud is governed by provider's terms of service, service-level agreement, and privacy policy.
"},{"location":"nebula-cloud/7.terms-and-conditions/#billing_and_payments","title":"Billing and payments","text":"Microsoft collects payments from you for your commercial marketplace purchases. You may pay the fees for the NebulaGraph Cloud services according to your chosen solutions. You shall pay all fees or charges to your account in accordance with the fees, charges, and billing terms in effect at the time a fee or charge is due and payable.
"},{"location":"nebula-cloud/7.terms-and-conditions/#accuracy_of_information","title":"Accuracy of information","text":"Occasionally there may be information on the Website that contains typographical errors, inaccuracies or omissions that may relate to pricing, availability, promotions and offers. We reserve the right to correct any errors, inaccuracies or omissions, and to change or update information or cancel orders if any information on the Website or Services is inaccurate at any time without prior notice (including after you have submitted your order). We undertake no obligation to update, amend or clarify information on the Website including, without limitation, pricing information, except as required by law. No specified update or refresh date applied on the Website should be taken to indicate that all information on the Website or Services has been modified or updated.
"},{"location":"nebula-cloud/7.terms-and-conditions/#data_and_content_protection","title":"Data and content protection","text":"vesoft understands and recognizes that all the data processed, stored, uploaded, downloaded, distributed, or processed through services provided by NebulaGraph Cloud is your data or content, and you fully own your data and content. Except for the implementation of your service requirements, no unauthorized use or disclosure of your data or content will be made except in the following circumstances:
a.vesoft may disclose the data or content in any legal proceeding or to a governmental body as required by Law;
b.an agreement made between you and vesoft.
You can delete or edit your data or content yourself. If you have deleted the service or data, vesoft will delete your data and will no longer retain such data in accordance with your instructions. You should operate carefully with regard to operations such as deletion or modification.
You understand and agree: when your subscription is in the Suspended state, Microsoft gives the customer a 30-day grace period before automatically canceling the subscription. After the 30-day grace period is over, the webhook will receive an Unsubscribe action. After vesoft receives a cancellation webhook call, vesoft will only continue to store your data and content (if any) within 7 days. After 7 days, vesoft will delete all your data and content, including all cached or backup copies, and will no longer retain any of them.
Once the data or content is deleted, it cannot be restored; you shall take responsibilities caused by the data being deleted. You understand and agree that vesoft has no obligation to continue to retain, export or return your data or content.
"},{"location":"nebula-cloud/7.terms-and-conditions/#links_to_other_resources","title":"Links to other resources","text":"Although the Website and Services may link to other resources (such as websites), we are not, directly or indirectly, implying any approval, association, sponsorship, endorsement, or affiliation with any linked resource, unless specifically stated herein. You acknowledge that vesoft inc. is providing these links to you only as a convenience. We are not responsible for examining or evaluating, and we do not warrant the offerings of, any businesses or individuals or the content of their resources. We do not assume any responsibility or liability for the actions, products, services, and content of any other third parties. You should carefully review the legal statements and other conditions of use of any resource which you access through a link on the Website and Services. Your linking to any other off-site resources is at your own risk.
"},{"location":"nebula-cloud/7.terms-and-conditions/#prohibited_uses","title":"Prohibited uses","text":"In addition to other terms as set forth in the Agreement, you are prohibited from using the Website and Services or Content: (a) for any unlawful purpose; (b) to solicit others to perform or participate in any unlawful acts; (c) to violate any international, federal, provincial or state regulations, rules, laws, or local ordinances; (d) to infringe upon or violate our intellectual property rights or the intellectual property rights of others; (e) to harass, abuse, insult, harm, defame, slander, disparage, intimidate, or discriminate based on gender, sexual orientation, religion, ethnicity, race, age, national origin, or disability; (f) to submit false or misleading information; (g) to upload or transmit viruses or any other type of malicious code that will or may be used in any way that will affect the functionality or operation of the Website and Services, third party products and services, or the Internet; (h) to spam, phish, pharm, pretext, spider, crawl, or scrape; (i) for any obscene or immoral purpose; or (j) to interfere with or circumvent the security features of the Website and Services, third party products and services, or the Internet. We reserve the right to terminate your use of the Website and Services for violating any of the prohibited uses.
"},{"location":"nebula-cloud/7.terms-and-conditions/#intellectual_property_rights","title":"Intellectual property rights","text":"\"Intellectual Property Rights\" means all present and future rights conferred by law or statute in or in relation to any copyright and related rights, trademarks, designs, patents, inventions, goodwill and the right to sue for passing off, rights to inventions, rights to use, and all other intellectual property rights, in each case whether registered or unregistered and including all applications and rights to apply for and be granted, rights to claim priority from, such rights and all similar or equivalent rights or forms of protection and any other results of intellectual activity which subsist or will subsist now or in the future in any part of the world. This Agreement does not transfer to you any intellectual property owned by vesoft inc. or third parties, and all rights, titles, and interests in and to such property will remain (as between the parties) solely with vesoft inc. All trademarks, service marks, graphics and logos used in connection with the Website and Services, are trademarks or registered trademarks of vesoft inc. or its licensors. Other trademarks, service marks, graphics and logos used in connection with the Website and Services may be the trademarks of other third parties. Your use of the Website and Services grants you no right or license to reproduce or otherwise use any of vesoft inc. or third party trademarks.
"},{"location":"nebula-cloud/7.terms-and-conditions/#disclaimer_of_warranty","title":"Disclaimer of warranty","text":"You agree that such Service is provided on an \"as is\" and \"as available\" basis and that your use of the Website and Services is solely at your own risk. We expressly disclaim all warranties of any kind, whether express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose and non-infringement. We make no warranty that the Services will meet your requirements, or that the Service will be uninterrupted, timely, secure, or error-free; nor do we make any warranty as to the results that may be obtained from the use of the Service or as to the accuracy or reliability of any information obtained through the Service or that defects in the Service will be corrected. You understand and agree that any material and/or data downloaded or otherwise obtained through the use of Service is done at your own discretion and risk and that you will be solely responsible for any damage or loss of data that results from the download of such material and/or data. We make no warranty regarding any goods or services purchased or obtained through the Service or any transactions entered into through the Service. No advice or information, whether oral or written, obtained by you from us or through the Service shall create any warranty not expressly made herein.
"},{"location":"nebula-cloud/7.terms-and-conditions/#limitation_of_liability","title":"Limitation of liability","text":"To the fullest extent permitted by applicable law, in no event will vesoft inc., its affiliates, directors, officers, employees, agents, suppliers or licensors be liable to any person for any indirect, incidental, special, punitive, cover or consequential damages (including, without limitation, damages for lost profits, revenue, sales, goodwill, use of content, impact on business, business interruption, loss of anticipated savings, loss of business opportunity) however caused, under any theory of liability, including, without limitation, contract, tort, warranty, breach of statutory duty, negligence or otherwise, even if the liable party has been advised as to the possibility of such damages or could have foreseen such damages.
"},{"location":"nebula-cloud/7.terms-and-conditions/#indemnification","title":"Indemnification","text":"You agree to indemnify and hold vesoft inc. and its affiliates, directors, officers, employees, agents, suppliers and licensors harmless from and against any liabilities, losses, damages or costs, including reasonable attorneys' fees, incurred in connection with or arising from any third party allegations, claims, actions, disputes, or demands asserted against any of them as a result of or relating to your Content, your use of the Website and Services or any willful misconduct on your part.
"},{"location":"nebula-cloud/7.terms-and-conditions/#severability","title":"Severability","text":"All rights and restrictions contained in this Agreement may be exercised and shall be applicable and binding only to the extent that they do not violate any applicable laws and are intended to be limited to the extent necessary so that they will not render this Agreement illegal, invalid or unenforceable. If any provision or portion of any provision of this Agreement shall be held to be illegal, invalid or unenforceable by a court of competent jurisdiction, it is the intention of the parties that the remaining provisions or portions thereof shall constitute their agreement with respect to the subject matter hereof, and all such remaining provisions or portions thereof shall remain in full force and effect.
"},{"location":"nebula-cloud/7.terms-and-conditions/#dispute_resolution","title":"Dispute resolution","text":"The formation, interpretation, and performance of this Agreement and any disputes arising out of it shall be governed by the substantive and procedural laws of China without regard to its rules on conflicts or choice of law and, to the extent applicable, the laws of China. You further consent to the territorial jurisdiction of and exclusive venue in Internet Court of Hangzhou as the legal forum for any such dispute. You hereby waive any right to a jury trial in any proceeding arising out of or related to this Agreement. The United Nations Convention on Contracts for the International Sale of Goods does not apply to this Agreement.
"},{"location":"nebula-cloud/7.terms-and-conditions/#assignment","title":"Assignment","text":"You may not assign, resell, sub-license or otherwise transfer or delegate any of your rights or obligations hereunder, in whole or in part, without our prior written consent, which consent shall be at our own sole discretion and without obligation; any such assignment or transfer shall be null and void. We are free to assign any of its rights or obligations hereunder, in whole or in part, to any third party as part of the sale of all or substantially all of its assets or stock or as part of a merger.
"},{"location":"nebula-cloud/7.terms-and-conditions/#changes_and_amendments","title":"Changes and amendments","text":"We reserve the right to modify this Agreement or its terms relating to the Website and Services at any time, effective upon posting of an updated version of this Agreement on the Website. When we do, we will revise the updated date at the bottom of this page. Continued use of the Website and Services after any such changes shall constitute your consent to such changes.
"},{"location":"nebula-cloud/7.terms-and-conditions/#acceptance_of_these_terms","title":"Acceptance of these terms","text":"You acknowledge that you have read this Agreement and agree to all its terms and conditions. By accessing and using the Website and Services you agree to be bound by this Agreement. If you do not agree to abide by the terms of this Agreement, you are not authorized to access or use the Website and Services.
"},{"location":"nebula-cloud/7.terms-and-conditions/#contacting_us","title":"Contacting us","text":"If you would like to contact us to understand more about this Agreement or wish to contact us concerning any matter relating to it, you may send an email to legal@vesoft.com
This document was last updated on December 14, 2021
"},{"location":"nebula-cloud/8.privacy-policy/","title":"Privacy Policy","text":"This privacy policy (\"Policy\") describes how the personally identifiable information (\"Personal Information\") you may provide on the https://www.nebula-cloud.io[TODO] website (\"Website\" or \"Service\") and any of its related products and services (collectively, \"Services\") is collected, protected and used. It also describes the choices available to you regarding our use of your Personal Information and how you can access and update this information. This Policy is a legally binding agreement between you (\"User\", \"you\" or \"your\") and vesoft Inc. (\"vesoft Inc.\", \"we\", \"us\" or \"our\"). By accessing and using the Website and Services, you acknowledge that you have read, understood, and agree to be bound by the terms of this Agreement. This Policy does not apply to the practices of companies that we do not own or control, or to individuals that we do not employ or manage.
"},{"location":"nebula-cloud/8.privacy-policy/#automatic_collection_of_information","title":"Automatic collection of information","text":"When you open the Website, our servers automatically record information that your browser sends. This data may include information such as your device's IP address, browser type and version, operating system type and version, language preferences or the webpage you were visiting before you came to the Website and Services, pages of the Website and Services that you visit, the time spent on those pages, information you search for on the Website, access times and dates, and other statistics.
Information collected automatically is used only to identify potential cases of abuse and establish statistical information regarding the usage and traffic of the Website and Services. This statistical information is not otherwise aggregated in such a way that would identify any particular user of the system.
"},{"location":"nebula-cloud/8.privacy-policy/#collection_of_personal_information","title":"Collection of personal information","text":"You can access and use the Website and Services without telling us who you are or revealing any information by which someone could identify you as a specific, identifiable individual. If, however, you wish to use some of the features on the Website, you may be asked to provide certain Personal Information (for example, your name and e-mail address). We receive and store any information you knowingly provide to us when you create an account or fill any online forms on the Website. When required, this information may include the following:
You can choose not to provide us with your Personal Information, but then you may not be able to take advantage of some of the features on the Website. Users who are uncertain about what information is mandatory are welcome to contact us.
"},{"location":"nebula-cloud/8.privacy-policy/#use_and_processing_of_collected_information","title":"Use and processing of collected information","text":"In order to make the Website and Services available to you, or to meet a legal obligation, we need to collect and use certain Personal Information. If you do not provide the information that we request, we may not be able to provide you with the requested products or services. Some of the information we collect is directly from you via the Website and Services. However, we may also collect Personal Information about you from other sources. Any of the information we collect from you may be used for the following purposes:
Processing your Personal Information depends on how you interact with the Website and Services, where you are located in the world and if one of the following applies: (i) you have given your consent for one or more specific purposes; (ii) provision of information is necessary for the performance of an agreement with you and/or for any pre-contractual obligations thereof; (iii) processing is necessary for compliance with a legal obligation to which you are subject; (iv) processing is related to a task that is carried out in the public interest or in the exercise of official authority vested in us; (v) processing is necessary for the purposes of the legitimate interests pursued by us or by a third party.
Note that under some legislation we may be allowed to process information until you object to such processing (by opting out), without having to rely on consent or any other of the following legal bases below. In any case, we will be happy to clarify the specific legal basis that applies to the processing, and in particular whether the provision of Personal Information is a statutory or contractual requirement, or a requirement necessary to enter into a contract.
"},{"location":"nebula-cloud/8.privacy-policy/#managing_information","title":"Managing information","text":"You are able to delete certain Personal Information we have about you. The Personal Information you can delete may change as the Website and Services change. If you would like to delete your Personal Information or permanently delete your account, you can do so by contacting us.
"},{"location":"nebula-cloud/8.privacy-policy/#disclosure_of_information","title":"Disclosure of information","text":"Depending on the requested Services or as necessary to complete any transaction or provide any service you have requested, we may share your information with your consent with our trusted third parties that work with us, any other affiliates and subsidiaries we rely upon to assist in the operation of the Website and Services available to you. We do not share Personal Information with unaffiliated third parties. These service providers are not authorized to use or disclose your information except as necessary to perform services on our behalf or comply with legal requirements. We may share your Personal Information for these purposes only with third parties whose privacy policies are consistent with ours or who agree to abide by our policies with respect to Personal Information. These third parties are given Personal Information they need only in order to perform their designated functions, and we do not authorize them to use or disclose Personal Information for their own marketing or other purposes.
We will disclose any Personal Information we collect, use or receive if required or permitted by law, such as to comply with a subpoena, or similar legal process, and when we believe in good faith that disclosure is necessary to protect our rights, protect your safety or the safety of others, investigate fraud, or respond to a government request.
In the event we go through a business transition, such as a merger or acquisition by another company, or sale of all or a portion of its assets, your user account, and Personal Information will likely be among the assets transferred.
"},{"location":"nebula-cloud/8.privacy-policy/#retention_of_information","title":"Retention of information","text":"We will retain and use your Personal Information for the period necessary to comply with our legal obligations, resolve disputes, and enforce our agreements unless a longer retention period is required or permitted by law. We may use any aggregated data derived from or incorporating your Personal Information after you update or delete it, but not in a manner that would identify you personally. Once the retention period expires, Personal Information shall be deleted. Therefore, the right to access, the right to erasure, the right to rectification and the right to data portability cannot be enforced after the expiration of the retention period.
"},{"location":"nebula-cloud/8.privacy-policy/#transfer_of_information","title":"Transfer of information","text":"Depending on your location, data transfers may involve transferring and storing your information in a country other than your own. You are entitled to learn about the legal basis of information transfers to a country outside the European Union or to any international organization governed by public international law or set up by two or more countries, such as the UN, and about the security measures taken by us to safeguard your information. If any such transfer takes place, you can find out more by checking the relevant sections of this Policy or inquire with us using the information provided in the contact section.
"},{"location":"nebula-cloud/8.privacy-policy/#the_rights_of_users","title":"The rights of users","text":"You may exercise certain rights regarding your information processed by us. In particular, you have the right to do the following: (i) you have the right to withdraw consent where you have previously given your consent to the processing of your information; (ii) you have the right to object to the processing of your information if the processing is carried out on a legal basis other than consent; (iii) you have the right to learn if information is being processed by us, obtain disclosure regarding certain aspects of the processing and obtain a copy of the information undergoing processing; (iv) you have the right to verify the accuracy of your information and ask for it to be updated or corrected; (v) you have the right, under certain circumstances, to restrict the processing of your information, in which case, we will not process your information for any purpose other than storing it; (vi) you have the right, under certain circumstances, to obtain the erasure of your Personal Information from us; (vii) you have the right to receive your information in a structured, commonly used and machine readable format and, if technically feasible, to have it transmitted to another controller without any hindrance. This provision is applicable provided that your information is processed by automated means and that the processing is based on your consent, on a contract which you are part of or on pre-contractual obligations thereof.
"},{"location":"nebula-cloud/8.privacy-policy/#the_right_to_object_to_processing","title":"The right to object to processing","text":"Where Personal Information is processed for the public interest, in the exercise of an official authority vested in us or for the purposes of the legitimate interests pursued by us, you may object to such processing by providing a ground related to your particular situation to justify the objection.
"},{"location":"nebula-cloud/8.privacy-policy/#how_to_exercise_these_rights","title":"How to exercise these rights","text":"Any requests to exercise your rights can be directed to vesoft Inc. through the contact details provided in this document. Please note that we may ask you to verify your identity before responding to such requests. Your request must provide sufficient information that allows us to verify that you are the person you are claiming to be or that you are the authorized representative of such person. You must include sufficient details to allow us to properly understand the request and respond to it. We cannot respond to your request or provide you with Personal Information unless we first verify your identity or authority to make such a request and confirm that the Personal Information relates to you.
"},{"location":"nebula-cloud/8.privacy-policy/#privacy_of_children","title":"Privacy of children","text":"We do not knowingly collect any Personal Information from children under the age of 18. If you are under the age of 18, please do not submit any Personal Information through the Website and Services. We encourage parents and legal guardians to monitor their children's Internet usage and to help enforce this Policy by instructing their children never to provide Personal Information through the Website and Services without their permission. If you have reason to believe that a child under the age of 18 has provided Personal Information to us through the Website and Services, please contact us. You must also be at least 16 years of age to consent to the processing of your Personal Information in your country (in some countries we may allow your parent or guardian to do so on your behalf).
"},{"location":"nebula-cloud/8.privacy-policy/#cookies","title":"Cookies","text":"The Website and Services use \"cookies\" to help personalize your online experience. A cookie is a text file that is placed on your hard disk by a web page server. Cookies cannot be used to run programs or deliver viruses to your computer. Cookies are uniquely assigned to you, and can only be read by a web server in the domain that issued the cookie to you. We may use cookies to collect, store, and track information for statistical purposes to operate the Website and Services. You have the ability to accept or decline cookies. Most web browsers automatically accept cookies, but you can usually modify your browser setting to decline cookies if you prefer. To learn more about cookies and how to manage them, visit internetcookies.org
"},{"location":"nebula-cloud/8.privacy-policy/#email_marketing","title":"Email marketing","text":"We offer electronic newsletters to which you may voluntarily subscribe at any time. We are committed to keeping your e-mail address confidential and will not disclose your email address to any third parties except as allowed in the information use and processing section. We will maintain the information sent via e-mail in accordance with applicable laws and regulations.
"},{"location":"nebula-cloud/8.privacy-policy/#links_to_other_resources","title":"Links to other resources","text":"The Website and Services contain links to other resources that are not owned or controlled by us. Such links do not constitute an endorsement by vesoft Inc. of those External Web Sites. Please be aware that we are not responsible for the privacy practices of such other resources or third parties. We encourage you to be aware when you leave the Website and Services and to read the privacy statements of each and every resource that may collect Personal Information. You should carefully review the legal statements and other conditions of use of any resource which you access through a link on the Website and Services.
"},{"location":"nebula-cloud/8.privacy-policy/#information_security","title":"Information security","text":"We secure information you provide on computer servers in a controlled, secure environment, protected from unauthorized access, use, or disclosure. We maintain reasonable administrative, technical, and physical safeguards in an effort to protect against unauthorized access, use, modification, and disclosure of Personal Information in its control and custody. However, no data transmission over the Internet or wireless network can be guaranteed. Therefore, while we strive to protect your Personal Information, you acknowledge that (i) there are security and privacy limitations of the Internet which are beyond our control; (ii) the security, integrity, and privacy of any and all information and data exchanged between you and the Website and Services cannot be guaranteed; and (iii) any such information and data may be viewed or tampered with in transit by a third party, despite best efforts.
"},{"location":"nebula-cloud/8.privacy-policy/#data_breach","title":"Data breach","text":"In the event we become aware that the security of the Website and Services has been compromised or users Personal Information has been disclosed to unrelated third parties as a result of external activity, including, but not limited to, security attacks or fraud, we reserve the right to take reasonably appropriate measures, including, but not limited to, investigation and reporting, as well as notification to and cooperation with law enforcement authorities. In the event of a data breach, we will make reasonable efforts to notify affected individuals if we believe that there is a reasonable risk of harm to the user as a result of the breach or if notice is otherwise required by law. When we do, we will post a notice on the Website, send you an email.
"},{"location":"nebula-cloud/8.privacy-policy/#changes_and_amendments","title":"Changes and amendments","text":"We reserve the right to modify this Policy or its terms relating to the Website and Services from time to time in our discretion and will notify you of any material changes to the way in which we treat Personal Information. When we do, we will revise the updated date at the bottom of this page. We may also provide notice to you in other ways in our discretion, such as through contact information you have provided. Any updated version of this Policy will be effective immediately upon the posting of the revised Policy unless otherwise specified. Your continued use of the Website and Services after the effective date of the revised Policy (or such other act specified at that time) will constitute your consent to those changes. However, we will not, without your consent, use your Personal Information in a manner materially different than what was stated at the time your Personal Information was collected.
"},{"location":"nebula-cloud/8.privacy-policy/#dispute_resolution","title":"Dispute resolution","text":"The formation, interpretation, and performance of this Agreement and any disputes arising out of it shall be governed by the substantive and procedural laws of China without regard to its rules on conflicts or choice of law and, to the extent applicable, the laws of China. You further consent to the personal jurisdiction of and exclusive venue in Yuhang District Court located in Hangzhou as the legal forum for any such dispute. You hereby waive any right to a jury trial in any proceeding arising out of or related to this Agreement. The United Nations Convention on Contracts for the International Sale of Goods does not apply to this Agreement.
"},{"location":"nebula-cloud/8.privacy-policy/#acceptance_of_this_policy","title":"Acceptance of this policy","text":"You acknowledge that you have read this Policy and agree to all its terms and conditions. By accessing and using the Website and Services you agree to be bound by this Policy. If you do not agree to abide by the terms of this Policy, you are not authorized to access or use the Website and Services.
"},{"location":"nebula-cloud/8.privacy-policy/#contacting_us","title":"Contacting us","text":"If you would like to contact us to understand more about this Policy or wish to contact us concerning any matter relating to individual rights and your Personal Information, you may send an email to legal@vesoft.com
This document was last updated on December 28, 2021
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/","title":"Solution","text":"On the Solution page, the sidebars are different based on roles and privileges. For more information, see Roles and privileges in Cloud.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#solution_info","title":"Solution Info","text":"On the homepage of Cloud, click on the Solution's name to enter the Solution Info page. The Solution Info page consists of the following parts: Basic Info, Instance Info, Price Info, Getting Started. You can view the information on this page in detail.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#applications","title":"Applications","text":"In the sidebar, click Applications to enter the page of ecosystem tools(Dashboard/Studio/Explorer). Different roles see different ecosystem tools. For more information, see Accessory applications.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#connectivity","title":"Connectivity","text":"In the sidebar, click Connectivity to enter Private Link page. On this page, you can create a Private Link endpoint that enables you to access NebulaGraph databases through a private IP address in a virtual network. For more information, see Private Link.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#root_management","title":"Root Management","text":"In the sidebar, click Root Management to enter the root account management page. For more information, see Role and User Management.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#user_management","title":"User Management","text":"In the sidebar, click User Management to enter the user account management page. For more information, see Role and User Management.
"},{"location":"nebula-cloud/5.solution/5.0.introduce-solution/#audit_log","title":"Audit Log","text":"In the sidebar, click Audit Log to enter the operation history page. You can select the time period according to the operation information such as Create Solution
, Start Solution
, Stop Solution
, and filter results by operator and operation record.
In the sidebar, click Settings to enter the settings page, and you can Stop Solution
or Transfer Solution
in this page.
NebulaGraph Cloud integrates with NebulaGraph Studio, NebulaGraph Dashboard, and NebulaGraph Explorer.
On the Applications page, ecosystem tools are different based on roles and privileges. The correspondence between different roles and privileges is as follows. The first column means the tools that the role can use, Y means the role has the corresponding privileges, and N means the role has no privileges.
Tools OWNER ROOT USER Dashboard Y Y N Studio N Y Y Explorer N Y Y"},{"location":"nebula-cloud/5.solution/5.1.supporting-application/#dashboard","title":"Dashboard","text":"NebulaGraph Dashboard (Dashboard for short) is a visualization tool that monitors and manages the status of machines and services in NebulaGraph clusters.
"},{"location":"nebula-cloud/5.solution/5.1.supporting-application/#studio","title":"Studio","text":"NebulaGraph Studio (Studio in short) is a browser-based visualization tool to manage NebulaGraph. It provides you with a graphical user interface to manipulate graph schemas, import data, explore graph data, and run nGQL statements to retrieve data. With Studio, you can quickly become a graph exploration expert from scratch. For more information, see What is NebulaGraph Studio.
"},{"location":"nebula-cloud/5.solution/5.1.supporting-application/#explorer","title":"Explorer","text":"NebulaGraph Explorer (Explorer in short) is a browser-based visualization tool. It is used with the NebulaGraph core to visualize interaction with graph data. Even without any experience in a graph database, you can quickly become a graph exploration expert.
"},{"location":"nebula-cloud/5.solution/5.2.connection-configuration-and-use/","title":"Private Link","text":"You can create a Private Link endpoint in Connectivity to allow users to access NebulaGraph databases through a private IP in a virtual network, without exposing your traffic to the public internet. For more information about Private Link, see What is Azure Private Link?.
"},{"location":"nebula-cloud/5.solution/5.2.connection-configuration-and-use/#configure_private_link","title":"Configure Private Link","text":"Enter your subscription ID, click Create. The creation time takes about 2 minutes.
Note
The subscription ID on the Subscription page of Azure Portal. You can click on the [Subscriptions] (https://portal.azure.com/?l=en.en-us#blade/Microsoft_Azure_Billing/SubscriptionsBlade) page for quick access.
After the creation, you can use Alias to connect to Azure resources and create a private endpoint in Azure.
Click + add.
In the Basics section, fill in the following plan details:
Project details
Field Description Subscription Select the subscription. Resource group Select an existing resource group or create a new resource group.Instance details
Field Description Name Set the name of the private endpoint. Region Select the region.Caution
The region of the database you select should be in the same area as that of your business to avoid performance and speed problems.
At the bottom of the Basics page, click Next: Resource.
In the Resource section, fill in the following plan details:
Field Description Connection method Click Connect to an Azure resource by resource ID or alias. Resource ID or alias Set the alias. Request message Set the message; this message will be sent to the resource owner.
The alias is on the Connectivity page of NebulaGraph Cloud, click to copy it.
At the bottom of the Resource page, click Next: Configuration.
In the Configuration section, select the following plan details:
Networking
Field Description Virtual network Set virtual networks. Subnet Set the subnet in the selected virtual network.Note
Private DNS integration is currently not supported.
At the bottom of the Configuration page, click Next: Tags.
(optional)In the Tags section, enter Name:Values.
At the bottom of the Tags page, click Next: Review + create.
After creating the private endpoint, copy the Private IP address in Network interface to the Connectivity page in Cloud. Click the Create.
Note
Private Link Endpoint IP information is stored in the Cloud, and you can click to modify.
You can use Private link endpoint IP to connect to NebulaGraph. For more information, see Connect to NebulaGraph.
"},{"location":"nebula-cloud/5.solution/5.3.role-and-authority-management/","title":"Roles and authority management","text":"NebulaGraph Cloud roles are different from roles in NebulaGraph. For more information, see Roles in Cloud Solution.
Roles in Cloud Roles in NebulaGraph OWNER - ROOT ROOT USER ADMIN/DBA/GUEST/USER"},{"location":"nebula-cloud/5.solution/5.3.role-and-authority-management/#root_management","title":"Root Management","text":"Only users with OWNER authority can manage ROOT users.
On the Root Management page, OWNER can reset ROOT users.
Click Reset, enter the email address of the ROOT user to be updated, and click Send Email to send the email.
After the ROOT user receives the confirmation email, click Confirm.
Only users with ROOT authority can manage USER users.
On the User Management page, the ROOT user can grant roles in graph spaces to other users. Available roles are ADMIN, DBA, GUEST, and USER.
NebulaGraph Dashboard Community Edition (Dashboard for short) is a visualization tool that monitors the status of machines and services in NebulaGraph clusters.
Enterpriseonly
Dashboard Enterprise Edition adds features such as visual cluster creation, batch import of clusters, fast scaling, etc. For more information, see Pricing.
"},{"location":"nebula-dashboard/1.what-is-dashboard/#features","title":"Features","text":"Dashboard monitors:
You can use Dashboard in one of the following scenarios:
Monitoring data is retained for 14 days by default; that is, only the monitoring data within the last 14 days can be queried.
Note
The monitoring service is supported by Prometheus. The update frequency and retention intervals can be modified. For details, see Prometheus.
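For example, the retention window is a Prometheus startup setting; a minimal sketch, assuming Prometheus is launched manually from prometheus.yml and a 30-day window is wanted (both the path and the value are illustrative):
$ prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=30d\n
The collection frequency itself can be changed through the scrape_interval option in the Dashboard configuration file described later in this topic.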
"},{"location":"nebula-dashboard/1.what-is-dashboard/#version_compatibility","title":"Version compatibility","text":"The version correspondence between NebulaGraph and Dashboard Community Edition is as follows.
NebulaGraph version Dashboard version 3.6.0 3.4.0 3.5.x 3.4.0 3.4.0 ~ 3.4.1 3.4.0, 3.2.0 3.3.0 3.2.0 2.5.0 ~ 3.2.0 3.1.0 2.5.x ~ 3.1.0 1.1.1 2.0.1~2.5.1 1.0.2 2.0.1~2.5.1 1.0.1"},{"location":"nebula-dashboard/1.what-is-dashboard/#release_note","title":"Release note","text":"Release
"},{"location":"nebula-dashboard/2.deploy-dashboard/","title":"Deploy Dashboard Community Edition","text":"This topic will describe how to deploy NebulaGraph Dashboard in detail.
To download and compile the latest source code of Dashboard, follow the instructions on the nebula dashboard GitHub page.
"},{"location":"nebula-dashboard/2.deploy-dashboard/#prerequisites","title":"Prerequisites","text":"Before you deploy Dashboard, you must confirm that:
Before the installation starts, the following ports are not occupied.
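A quick way to verify this; a minimal sketch, assuming the default ports of the Dashboard web service and its dependent services (7003, 8090, 9090, 9100, and 9200, as described later in this topic):
$ for port in 7003 8090 9090 9100 9200; do lsof -i :$port; done #any output means the port is already occupied\n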
Download the tar package nebula-dashboard-3.4.0.x86_64.tar.gz as needed.
Run tar -xvf nebula-dashboard-3.4.0.x86_64.tar.gz
to decompress the installation package.
Modify the config.yaml
file in nebula-dashboard
.
The configuration file contains the configurations of four dependent services and configurations of clusters. The descriptions of the dependent services are as follows.
Service Default port Description nebula-http-gateway 8090 Provides HTTP ports for cluster services to execute nGQL statements to interact with the NebulaGraph database. nebula-stats-exporter 9200 Collects the performance metrics in the cluster, including the IP addresses, versions, and monitoring metrics (such as the number of queries, the latency of queries, the latency of heartbeats, and so on). node-exporter 9100 Collects the source information of nodes in the cluster, including the CPU, memory, load, disk, and network. prometheus 9090 The time series database that stores monitoring data.The descriptions of the configuration file are as follows.
port: 7003 # Web service port.\ngateway:\n ip: hostIP # The IP of the machine where the Dashboard is deployed.\n port: 8090\n https: false # Whether to enable HTTPS.\n runmode: dev # Program running mode, including dev, test, and prod. It is used to distinguish between different running environments generally.\nstats-exporter:\n ip: hostIP # The IP of the machine where the Dashboard is deployed.\n nebulaPort: 9200\n https: false # Whether to enable HTTPS.\nnode-exporter:\n - ip: nebulaHostIP_1 # The IP of the machine where the NebulaGraph is deployed.\n port: 9100\n https: false # Whether to enable HTTPS.\n# - ip: nebulaHostIP_2\n# port: 9100\n# https: false\nprometheus:\n ip: hostIP # The IP of the machine where the Dashboard is deployed.\n prometheusPort: 9090\n https: false # Whether to enable HTTPS.\n scrape_interval: 5s # The interval for collecting the monitoring data, which is 1 minute by default.\n evaluation_interval: 5s # The interval for running alert rules, which is 1 minute by default.\n# Cluster node info\nnebula-cluster:\n name: 'default' # Cluster name\n metad:\n - name: metad0\n endpointIP: nebulaMetadIP # The IP of the machine where the Meta service is deployed.\n port: 9559\n endpointPort: 19559\n # - name: metad1\n # endpointIP: nebulaMetadIP\n # port: 9559\n # endpointPort: 19559 \n graphd:\n - name: graphd0\n endpointIP: nebulaGraphdIP # The IP of the machine where the Graph service is deployed.\n port: 9669\n endpointPort: 19669\n # - name: graphd1\n # endpointIP: nebulaGraphdIP\n # port: 9669\n # endpointPort: 19669 \n storaged:\n - name: storaged0\n endpointIP: nebulaStoragedIP # The IP of the machine where the Storage service is deployed.\n port: 9779\n endpointPort: 19779\n # - name: storaged1\n # endpointIP: nebulaStoragedIP\n # port: 9779\n # endpointPort: 19779 \n
Run ./dashboard.service start all
to start the services.
If you are deploying Dashboard using Docker, you should also modify the configuration file config.yaml
, and then run docker-compose up -d
to start the container.
Note
If you change the port number in config.yaml
, the port number in docker-compose.yaml
needs to be consistent as well.
Run docker-compose stop
to stop the container.
You can use the dashboard.service
script to start, restart, stop, and check the Dashboard services.
sudo <dashboard_path>/dashboard.service\n[-v] [-h]\n<start|restart|stop|status> <prometheus|webserver|exporter|gateway|all>\n
Parameter Description dashboard_path
Dashboard installation path. -v
Display detailed debugging information. -h
Display help information. start
Start the target services. restart
Restart the target services. stop
Stop the target services. status
Check the status of the target services. prometheus
Set the prometheus service as the target service. webserver
Set the webserver service as the target service. exporter
Set the exporter service as the target service. gateway
Set the gateway service as the target service. all
Set all the Dashboard services as the target services. Note
To view the Dashboard version, run the command ./dashboard.service -version
.
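More usage examples of the script; a minimal sketch, assuming an installation path of /usr/local/nebula-dashboard (adjust to your deployment):
$ sudo /usr/local/nebula-dashboard/dashboard.service status all\n$ sudo /usr/local/nebula-dashboard/dashboard.service restart prometheus\n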
Connect to Dashboard
"},{"location":"nebula-dashboard/3.connect-dashboard/","title":"Connect Dashboard","text":"After Dashboard is deployed, you can log in and use Dashboard on the browser.
"},{"location":"nebula-dashboard/3.connect-dashboard/#prerequisites","title":"Prerequisites","text":"Confirm the IP address of the machine where the Dashboard service is installed. Enter <IP>:7003
in the browser to open the login page.
Enter the username and the password of the NebulaGraph database.
root
as the username and random characters as the password. To enable authentication, see Authentication.
Select the NebulaGraph version to be used.
Click Login.
NebulaGraph Dashboard consists of three parts: Machine, Service, and Management. This topic will describe them in detail.
"},{"location":"nebula-dashboard/4.use-dashboard/#overview","title":"Overview","text":""},{"location":"nebula-dashboard/4.use-dashboard/#machine","title":"Machine","text":"Click Machine->Overview to enter the machine overview page.
On this page, you can view the variation of CPU, Memory, Load, Disk, and Network In/Out quickly.
To view the detailed monitoring information, click the button. In this example, select Load
for details. The figure is as follows.
Click Service->Overview to enter the service overview page.
On this page, you can view the information of Graph, Meta, and Storage services quickly. In the upper right corner, the number of normal services and abnormal services will be displayed.
Note
On the Service page, only two monitoring metrics can be set for each service, which can be adjusted by clicking the Set up button.
To view the detailed monitoring information, click the button. In this example, select Graph
for details. The figure is as follows.
Note
Before using graph space metrics, you need to set enable_space_level_metrics
to true
in the Graph service. For details, see Graph Service configuration.
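A minimal sketch of that setting in the Graph service configuration file (the file path is an assumption; adjust it to your deployment):
# Assumed file: /usr/local/nebula/etc/nebula-graphd.conf\n--enable_space_level_metrics=true\n
Restart the Graph service afterwards so that the change takes effect.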
Space-level metric incompatibility
If a graph space name contains special characters, the corresponding metric data of that graph space may not be displayed.
The service monitoring page can also monitor graph space level metrics. A graph space metric is created only after the corresponding behavior is triggered in the graph space; only then can you select that graph space to view information about the metric.
Graph space metrics record the information of different graph spaces separately. Currently, only the Graph service supports a set of space-level metrics.
For information about the space graph metrics, see Graph space.
"},{"location":"nebula-dashboard/4.use-dashboard/#management","title":"Management","text":""},{"location":"nebula-dashboard/4.use-dashboard/#overview_info","title":"Overview info","text":"On the Overview Info page, you can see the information of the NebulaGraph cluster, including Storage leader distribution, Storage service details, versions and hosts information of each NebulaGraph service, and partition distribution and details.
"},{"location":"nebula-dashboard/4.use-dashboard/#storage_leader_distribution","title":"Storage Leader Distribution","text":"In this section, the number of Leaders and the Leader distribution will be shown.
In this section, the version and host information of each NebulaGraph service will be shown. Click Detail in the upper right corner to view the details of the version and host information.
"},{"location":"nebula-dashboard/4.use-dashboard/#service_information","title":"Service information","text":"In this section, the information on Storage services will be shown. The parameter description is as follows:
Parameter DescriptionHost
The IP address of the host. Port
The port of the host. Status
The host status. Git Info Sha
The commit ID of the current version. Leader Count
The number of Leaders. Partition Distribution
The distribution of partitions. Leader Distribution
The distribution of Leaders. Click Detail in the upper right corner to view the details of the Storage service information.
"},{"location":"nebula-dashboard/4.use-dashboard/#partition_distribution","title":"Partition Distribution","text":"Select the specified graph space in the upper left corner, you can view the distribution of partitions in the specified graph space. You can see the IP addresses and ports of all Storage services in the cluster, and the number of partitions in each Storage service.
Click Detail in the upper right corner to view more details.
"},{"location":"nebula-dashboard/4.use-dashboard/#partition_information","title":"Partition information","text":"In this section, the information on partitions will be shown. Before viewing the partition information, you need to select a graph space in the upper left corner. The parameter description is as follows:
Parameter DescriptionPartition ID
The ID of the partition. Leader
The IP address and port of the leader. Peers
The IP addresses and ports of all the replicas. Losts
The IP addresses and ports of faulty replicas. Click Detail in the upper right corner to view details. You can also enter the partition ID into the input box in the upper right corner of the details page to filter the shown data.
"},{"location":"nebula-dashboard/4.use-dashboard/#config","title":"Config","text":"It shows the configuration of the NebulaGraph service. NebulaGraph Dashboard Community Edition does not support online modification of configurations for now.
"},{"location":"nebula-dashboard/4.use-dashboard/#others","title":"Others","text":"In the lower left corner of the page, you can:
This topic will describe the monitoring metrics in NebulaGraph Dashboard.
"},{"location":"nebula-dashboard/6.monitor-parameter/#machine","title":"Machine","text":"Note
cpu_utilization
The percentage of used CPU. cpu_idle
The percentage of idled CPU. cpu_wait
The percentage of CPU waiting for IO operations. cpu_user
The percentage of CPU used by users. cpu_system
The percentage of CPU used by the system."},{"location":"nebula-dashboard/6.monitor-parameter/#memory","title":"Memory","text":"Parameter Description memory_utilization
The percentage of used memory. memory_used
The memory space used (not including caches). memory_free
The memory space available."},{"location":"nebula-dashboard/6.monitor-parameter/#load","title":"Load","text":"Parameter Description load_1m
The average load of the system in the last 1 minute. load_5m
The average load of the system in the last 5 minutes. load_15m
The average load of the system in the last 15 minutes."},{"location":"nebula-dashboard/6.monitor-parameter/#disk","title":"Disk","text":"Parameter Description disk_used_percentage
The disk utilization percentage. disk_used
The disk space used. disk_free
The disk space available. disk_readbytes
The number of bytes that the system reads in the disk per second. disk_writebytes
The number of bytes that the system writes in the disk per second. disk_readiops
The number of read queries that the disk receives per second. disk_writeiops
The number of write queries that the disk receives per second. inode_utilization
The percentage of used inode."},{"location":"nebula-dashboard/6.monitor-parameter/#network","title":"Network","text":"Parameter Description network_in_rate
The number of bytes that the network card receives per second. network_out_rate
The number of bytes that the network card sends out per second. network_in_errs
The number of wrong bytes that the network card receives per second. network_out_errs
The number of wrong bytes that the network card sends out per second. network_in_packets
The number of data packages that the network card receives per second. network_out_packets
The number of data packages that the network card sends out per second."},{"location":"nebula-dashboard/6.monitor-parameter/#service","title":"Service","text":""},{"location":"nebula-dashboard/6.monitor-parameter/#period","title":"Period","text":"The period is the time range of counting metrics. It currently supports 5 seconds, 60 seconds, 600 seconds, and 3600 seconds, which respectively represent the last 5 seconds, the last 1 minute, the last 10 minutes, and the last 1 hour.
"},{"location":"nebula-dashboard/6.monitor-parameter/#metric_methods","title":"Metric methods","text":"Parameter Descriptionrate
The average rate of operations per second in a period. sum
The sum of operations in the period. avg
The average latency in the period. P75
The 75th percentile latency. P95
The 95th percentile latency. P99
The 99th percentile latency. P999
The 99.9th percentile latency. Note
Dashboard collects the following metrics from the NebulaGraph core, but only shows the metrics that are important to it.
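A metric query is composed as metric_name.method.period. Such metrics can also be read directly from a service's HTTP port; a hedged sketch against a Graph service (the address and the HTTP port 19669 are assumptions for a default deployment):
$ curl -G \"http://192.168.8.100:19669/stats?stats=num_queries.rate.60\" #queries per second over the last minute\n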
"},{"location":"nebula-dashboard/6.monitor-parameter/#graph","title":"Graph","text":"Parameter Descriptionnum_active_queries
The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. num_active_sessions
The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time.For example, when querying num_active_sessions.sum.5
, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is -20
(10-30). num_aggregate_executors
The number of executions for the Aggregation operator. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions_out_of_max_allowed
The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS
was exceeded. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_indexscan_executors
The number of executions for index scan operators. num_killed_queries
The number of killed queries. num_opened_sessions
The number of sessions connected to the server. num_queries
The number of queries. num_query_errors_leader_changes
The number of the raft leader changes due to query errors. num_query_errors
The number of query errors. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. num_sentences
The number of statements received by the Graphd service. num_slow_queries
The number of slow queries. num_sort_executors
The number of executions for the Sort operator. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. slow_query_latency_us
The latency of slow queries. num_queries_hit_memory_watermark
The number of queries that reached the memory watermark. resp_part_completeness
The completeness of the partial success. You need to set accept_partial_success
to true
in the graph configuration first."},{"location":"nebula-dashboard/6.monitor-parameter/#meta","title":"Meta","text":"Parameter Description commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. heartbeat_latency_us
The latency of heartbeats. num_heartbeats
The number of heartbeats. num_raft_votes
The number of votes in Raft. transfer_leader_latency_us
The latency of transferring the raft leader. num_agent_heartbeats
The number of heartbeats for the AgentHBProcessor. agent_heartbeat_latency_us
The latency of the AgentHBProcessor. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_send_snapshot
The number of times that Raft sends snapshots to other nodes. append_log_latency_us
The latency of replicating the log record to a single node by Raft. append_wal_latency_us
The Raft write latency for a single WAL. num_grant_votes
The number of times that Raft votes for other nodes. num_start_elect
The number of times that Raft starts an election."},{"location":"nebula-dashboard/6.monitor-parameter/#storage","title":"Storage","text":"Parameter Description add_edges_latency_us
The latency of adding edges. add_vertices_latency_us
The latency of adding vertices. commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. delete_edges_latency_us
The latency of deleting edges. delete_vertices_latency_us
The latency of deleting vertices. get_neighbors_latency_us
The latency of querying neighbor vertices. get_dst_by_src_latency_us
The latency of querying the destination vertex by the source vertex. num_get_prop
The number of executions for the GetPropProcessor. num_get_neighbors_errors
The number of execution errors for the GetNeighborsProcessor. num_get_dst_by_src_errors
The number of execution errors for the GetDstBySrcProcessor. get_prop_latency_us
The latency of executions for the GetPropProcessor. num_edges_deleted
The number of deleted edges. num_edges_inserted
The number of inserted edges. num_raft_votes
The number of votes in Raft. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Storage service sent to the Meta service. num_rpc_sent_to_metad
The number of RPC requests that the Storaged service sent to the Metad service. num_tags_deleted
The number of deleted tags. num_vertices_deleted
The number of deleted vertices. num_vertices_inserted
The number of inserted vertices. transfer_leader_latency_us
The latency of transferring the raft leader. lookup_latency_us
The latency of executions for the LookupProcessor. num_lookup_errors
The number of execution errors for the LookupProcessor. num_scan_vertex
The number of executions for the ScanVertexProcessor. num_scan_vertex_errors
The number of execution errors for the ScanVertexProcessor. update_edge_latency_us
The latency of executions for the UpdateEdgeProcessor. num_update_vertex
The number of executions for the UpdateVertexProcessor. num_update_vertex_errors
The number of execution errors for the UpdateVertexProcessor. kv_get_latency_us
The latency of executions for the GetProcessor. kv_put_latency_us
The latency of executions for the PutProcessor. kv_remove_latency_us
The latency of executions for the RemoveProcessor. num_kv_get_errors
The number of execution errors for the GetProcessor. num_kv_get
The number of executions for the GetProcessor. num_kv_put_errors
The number of execution errors for the PutProcessor. num_kv_put
The number of executions for the PutProcessor. num_kv_remove_errors
The number of execution errors for the RemoveProcessor. num_kv_remove
The number of executions for the RemoveProcessor. forward_tranx_latency_us
The latency of transmission. scan_edge_latency_us
The latency of executions for the ScanEdgeProcessor. num_scan_edge_errors
The number of execution errors for the ScanEdgeProcessor. num_scan_edge
The number of executions for the ScanEdgeProcessor. scan_vertex_latency_us
The latency of executions for the ScanVertexProcessor. num_add_edges
The number of times that edges are added. num_add_edges_errors
The number of errors when adding edges. num_add_vertices
The number of times that vertices are added. num_start_elect
The number of times that Raft starts an election. num_add_vertices_errors
The number of errors when adding vertices. num_delete_vertices_errors
The number of errors when deleting vertices. append_log_latency_us
The latency of replicating the log record to a single node by Raft. num_grant_votes
The number of times that Raft votes for other nodes. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_delete_tags
The number of times that tags are deleted. num_delete_tags_errors
The number of errors when deleting tags. num_delete_edges
The number of edge deletions. num_delete_edges_errors
The number of errors when deleting edges. num_send_snapshot
The number of times that snapshots are sent. update_vertex_latency_us
The latency of executions for the UpdateVertexProcessor. append_wal_latency_us
The Raft write latency for a single WAL. num_update_edge
The number of executions for the UpdateEdgeProcessor. delete_tags_latency_us
The latency of deleting tags. num_update_edge_errors
The number of execution errors for the UpdateEdgeProcessor. num_get_neighbors
The number of executions for the GetNeighborsProcessor. num_get_dst_by_src
The number of executions for the GetDstBySrcProcessor. num_get_prop_errors
The number of execution errors for the GetPropProcessor. num_delete_vertices
The number of times that vertices are deleted. num_lookup
The number of executions for the LookupProcessor. num_sync_data
The number of times the Storage service synchronizes data from the Drainer. num_sync_data_errors
The number of errors that occur when the Storage service synchronizes data from the Drainer. sync_data_latency_us
The latency of the Storage service synchronizing data from the Drainer."},{"location":"nebula-dashboard/6.monitor-parameter/#graph_space","title":"Graph space","text":"Note
Space-level metrics are created dynamically, so that only when the behavior is triggered in the graph space, the corresponding metric is created and can be queried by the user.
Parameter Descriptionnum_active_queries
The number of queries currently being executed. num_queries
The number of queries. num_sentences
The number of statements received by the Graphd service. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. num_slow_queries
The number of slow queries. num_query_errors
The number of query errors. num_query_errors_leader_changes
The number of raft leader changes due to query errors. num_killed_queries
The number of killed queries. num_aggregate_executors
The number of executions for the Aggregation operator. num_sort_executors
The number of executions for the Sort operator. num_indexscan_executors
The number of executions for index scan operators. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_opened_sessions
The number of sessions connected to the server. num_queries_hit_memory_watermark
The number of queries that reached the memory watermark. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. slow_query_latency_us
The latency of slow queries."},{"location":"nebula-studio/system-settings/","title":"Global settings","text":"This topic introduces the global settings of NebulaGraph Studio, including language switching and beta functions.
Beta functions: Switch on/off beta features, which include view schema, text to query and AI import.
The text to query and AI import features need to be configured with AI-related configurations. See below for detailed configurations.
The text to query and AI import are artificial intelligence features developed based on the large language model (LLM) and require the following parameters to be configured.
Parameter Description API type The API type for AI. Valid values areOpenAI
and Aliyun
. URL The API URL. Fill in the correct URL format according to the corresponding API type. For example, https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/chat/completions?api-version={api-version}
. Key The key used to validate the API. The key is required when using an online large language model, and is optional depending on the actual settings when using an offline large language model. Model The version of the large language model. The model is required when using an online large language model, and is optional depending on the actual settings when using an offline large language model. Max text length The maximum length for receiving or generating a single piece of text. Unit: byte."},{"location":"nebula-studio/about-studio/st-ug-limitations/","title":"Limitations","text":"
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#architecture","title":"Architecture","text":"For now, Studio v3.x supports x86_64 architecture only.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#upload_data","title":"Upload data","text":"Only CSV files without headers can be uploaded, but no limitations are applied to the size and store period for a single file. The maximum data volume depends on the storage capacity of your machine.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#data_backup","title":"Data backup","text":"For now, only supports exporting query results in CSV format on Console, and other data backup methods are not supported.
"},{"location":"nebula-studio/about-studio/st-ug-limitations/#ngql_statements","title":"nGQL statements","text":"On the Console page of Docker-based and RPM-based Studio v3.x, all the nGQL syntaxes except these are supported:
USE <space_name>
: You cannot run such a statement on the Console page to choose a graph space. As an alternative, you can click a graph space name in the drop-down list of Current Graph Space.We recommend that you use the latest version of Chrome to get access to Studio. Otherwise, some features may not work properly.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/","title":"What is NebulaGraph Studio","text":"NebulaGraph Studio (Studio in short) is a browser-based visualization tool to manage NebulaGraph. It provides you with a graphical user interface to manipulate graph schemas, import data, and run nGQL statements to retrieve data. With Studio, you can quickly become a graph exploration expert from scratch. You can view the latest source code in the NebulaGraph GitHub repository, see nebula-studio for details.
Note
You can also try some functions online in Studio.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#deployment","title":"Deployment","text":"In addition to deploying Studio with RPM-based, DEB-based, or Tar-based packages, or with Docker, you can also deploy Studio with Helm in the Kubernetes cluster. For more information, see Deploy Studio.
The functions of the above four deployment methods are the same and may be restricted when using Studio. For more information, see Limitations.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#features","title":"Features","text":"Studio can easily manage NebulaGraph data, with the following functions:
You can use Studio in one of these scenarios:
Authentication is not enabled in NebulaGraph by default. Users can log into Studio with the root
account and any password.
When NebulaGraph enables authentication, users can only sign into Studio with the specified account. For more information, see Authentication.
"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#version_compatibility","title":"Version compatibility","text":"Note
The Studio version is released independently of the NebulaGraph core. The correspondence between the versions of Studio and the NebulaGraph core, as shown in the table below.
NebulaGraph version Studio version 3.6.0 3.8.0, 3.7.0 3.5.0 3.7.0 3.4.0 ~ 3.4.1 3.7.0, 3.6.0, 3.5.1, 3.5.0 3.3.0 3.5.1, 3.5.0 3.0.0 ~ 3.2.0 3.4.1, 3.4.0 3.1.0 3.3.2 3.0.0 3.2.x 2.6.x 3.1.x 2.6.x 3.1.x 2.0 & 2.0.1 2.x 1.x 1.x"},{"location":"nebula-studio/about-studio/st-ug-what-is-graph-studio/#check_updates","title":"Check updates","text":"
To view the Changelog, on the upper-right corner of the page, click the version and then New version.
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/","title":"Connect to NebulaGraph","text":"After successfully launching Studio, you need to configure to connect to NebulaGraph. This topic describes how Studio connects to the NebulaGraph database.
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/#prerequisites","title":"Prerequisites","text":"Before connecting to the NebulaGraph database, you need to confirm the following information:
9669
. To connect Studio to NebulaGraph, follow these steps:
Type http://<ip_address>:7001
in the address bar of your browser.
The following login page shows that Studio starts successfully.
On the Config Server page of Studio, configure these fields:
Graphd IP address: Enter the IP address of the Graph service of NebulaGraph. For example, 192.168.10.100
.
Note
127.0.0.1
or localhost
.9669
.Username and Password: Fill in the log in account according to the authentication settings of NebulaGraph.
root
and any password as the username and its password.root
and nebula
as the username and its password.After the configuration, click the Connect button.
Note
One session continues for up to 30 minutes. If you do not operate Studio within 30 minutes, the active session will time out and you must connect to NebulaGraph again.
A welcome page is displayed on the first login, showing the relevant functions according to the usage process, and the test datasets can be automatically downloaded and imported.
To visit the welcome page, click .
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/#next_to_do","title":"Next to do","text":"When Studio is successfully connected to NebulaGraph, you can do these operations:
Note
The permissions of an account determine the operations that can be performed. For details, see Roles and privileges.
"},{"location":"nebula-studio/deploy-connect/st-ug-connect/#log_out","title":"Log out","text":"If you want to reconnect to NebulaGraph, you can log out and reconfigure the database.
Click the user profile picture in the upper right corner, and choose Log out.
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/","title":"Deploy Studio","text":"This topic describes how to deploy Studio locally by RPM, DEB, tar package and Docker.
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#rpm-based_studio","title":"RPM-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites","title":"Prerequisites","text":"Before you deploy RPM-based Studio, you must confirm that:
lsof
.Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by Studio.Select and download the RPM package according to your needs. It is recommended to select the latest version. Common links are as follows:
Installation package Checksum NebulaGraph version nebula-graph-studio-3.9.0.x86_64.rpm nebula-graph-studio-3.9.0.x86_64.rpm.sha256 masterUse sudo rpm -i <rpm_name>
to install RPM package.
For example, install Studio 3.9.0, use the following command. The default installation path is /usr/local/nebula-graph-studio
.
$ sudo rpm -i nebula-graph-studio-3.9.0.x86_64.rpm\n
You can also install it to the specified path using the following command:
$ sudo rpm -i nebula-graph-studio-3.9.0.x86_64.rpm --prefix=<path> \n
When the screen returns the following message, it means that the PRM-based Studio has been successfully started.
Start installing NebulaGraph Studio now...\nNebulaGraph Studio has been installed.\nNebulaGraph Studio started automatically.\n
When Studio is started, use http://<ip address>:7001
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
You can uninstall Studio using the following command:
$ sudo rpm -e nebula-graph-studio-3.9.0.x86_64\n
If these lines are returned, PRM-based Studio has been uninstalled.
NebulaGraph Studio removed, bye~\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#exception_handling","title":"Exception handling","text":"If the automatic start fails during the installation process or you want to manually start or stop the service, use the following command:
$ bash /usr/local/nebula-graph-studio/scripts/rpm/start.sh\n
$ bash /usr/local/nebula-graph-studio/scripts/rpm/stop.sh\n
If you encounter an error bind EADDRINUSE 0.0.0.0:7001
when starting the service, you can use the following command to check port 7001 usage.
$ lsof -i:7001\n
If the port is occupied and the process on that port cannot be terminated, you can modify the startup port within the studio configuration and restart the service.
//Modify the studio service configuration. The default path to the configuration file is `/usr/local/nebula-graph-studio`.\n$ vi etc/studio-api.yam\n\n//Modify this port number and change it to any \nPort: 7001\n\n//Restart service\n$ systemctl restart nebula-graph-studio.service\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#deb-based_studio","title":"DEB-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_1","title":"Prerequisites","text":"Before you deploy DEB-based Studio, you must do a check of these:
Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by Studio/usr/lib/systemd/system
exists in the system. If not, create it manually.Select and download the DEB package according to your needs. It is recommended to select the latest version. Common links are as follows:
Installation package Checksum NebulaGraph version nebula-graph-studio-3.9.0.x86_64.deb nebula-graph-studio-3.9.0.x86_64.deb.sha256 masterUse sudo dpkg -i <deb_name>
to install DEB package.
For example, install Studio 3.9.0, use the following command:
$ sudo dpkg -i nebula-graph-studio-3.9.0.x86_64.deb\n
When Studio is started, use http://<ip address>:7001
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
You can uninstall Studio using the following command:
$ sudo dpkg -r nebula-graph-studio\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#tar-based_studio","title":"tar-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_2","title":"Prerequisites","text":"Before you deploy tar-based Studio, you must do a check of these:
Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by StudioSelect and download the tar package according to your needs. It is recommended to select the latest version. Common links are as follows:
Installation package Studio version nebula-graph-studio-3.9.0.x86_64.tar.gz 3.9.0Use tar -xvf
to decompress the tar package.
$ tar -xvf nebula-graph-studio-3.9.0.x86_64.tar.gz\n
Deploy and start nebula-graph-studio.
$ cd nebula-graph-studio\n$ ./server\n
When Studio is started, use http://<ip address>:7001
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
You can use kill pid
to stop the service:
$ kill $(lsof -t -i :7001) #stop nebula-graph-studio\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#docker-based_studio","title":"Docker-based Studio","text":""},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_3","title":"Prerequisites","text":"Before you deploy Docker-based Studio, you must do a check of these:
Before the installation starts, the following ports are not occupied.
Port Description 7001 Web service provided by StudioTo deploy and start Docker-based Studio, run the following commands. Here we use NebulaGraph vmaster for demonstration:
Download the configuration files for the deployment.
Installation package NebulaGraph version nebula-graph-studio-3.9.0.tar.gz masterCreate the nebula-graph-studio-3.9.0
directory and decompress the installation package to the directory.
$ mkdir nebula-graph-studio-3.9.0 -zxvf nebula-graph-studio-3.9.0.gz -C nebula-graph-studio-3.9.0\n
Change to the nebula-graph-studio-3.9.0
directory.
$ cd nebula-graph-studio-3.9.0\n
Pull the Docker image of Studio.
$ docker-compose pull\n
Build and start Docker-based Studio. In this command, -d
is to run the containers in the background.
$ docker-compose up -d\n
If these lines are returned, Docker-based Studio v3.x is deployed and started.
Creating docker_web_1 ... done\n
When Docker-based Studio is started, use http://<ip address>:7001
to get access to Studio.
Note
Run ifconfig
or ipconfig
to get the IP address of the machine where Docker-based Studio is running. On the machine running Docker-based Studio, you can use http://localhost:7001
to get access to Studio.
If you can see the Config Server page on the browser, Docker-based Studio is started successfully.
This section describes how to deploy Studio with Helm.
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#prerequisites_4","title":"Prerequisites","text":"Before installing Studio, you need to install the following software and ensure the correct version of the software:
Software Requirement Kubernetes >= 1.14 Helm >= 3.2.0"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#install_2","title":"Install","text":"Use Git to clone the source code of Studio to the host.
$ git clone https://github.com/vesoft-inc/nebula-studio.git\n
Make the nebula-studio
directory the current working directory.
bash $ cd nebula-studio
Assume using release name:my-studio
, installed Studio in Helm Chart.
$ helm upgrade --install my-studio --set service.type=NodePort --set service.port=30070 deployment/helm\n
The configuration parameters of the Helm Chart are described below.
Parameter Default value Description replicaCount 0 The number of replicas for Deployment. image.nebulaStudio.name vesoft/nebula-graph-studio The image name of nebula-graph-studio. image.nebulaStudio.version v3.9.0 The image version of nebula-graph-studio. service.type ClusterIP The service type, which should be one ofNodePort
, ClusterIP
, and LoadBalancer
. service.port 7001 The expose port for nebula-graph-studio's web. service.nodePort 32701 The proxy port for accessing nebula-studio outside kubernetes cluster. resources.nebulaStudio {} The resource limits/requests for nebula-studio. persistent.storageClassName \"\" The name of storageClass. The default value will be used if not specified. persistent.size 5Gi The persistent volume size. When Studio is started, use http://<node_address>:30070/
to get access to Studio.
If you can see the Config Server page on the browser, Studio is started successfully.
$ helm uninstall my-studio\n
"},{"location":"nebula-studio/deploy-connect/st-ug-deploy/#next_to_do","title":"Next to do","text":"On the Config Server page, connect Docker-based Studio to NebulaGraph. For more information, see Connect to NebulaGraph.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-edge-type/","title":"Manage edge types","text":"After a graph space is created in NebulaGraph, you can create edge types. With Studio, you can choose to use the Console page or the Schema page to create, retrieve, update, or delete edge types. This topic introduces how to use the Schema page to operate edge types in a graph space only.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-edge-type/#prerequisites","title":"Prerequisites","text":"To operate an edge type on the Schema page of Studio, you must do a check of these:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Edge Type tab and click the + Create button.
On the Create Edge Type page, do these settings:
serve
is used.Define Properties (Optional): If necessary, click + Add Property to do these settings:
TTL_COL
and TTL_ DURATION
(in seconds). For more information about both parameters, see TTL configuration.When the preceding settings are completed, in the Equivalent to the following nGQL statement panel, you can see the nGQL statement equivalent to these settings.
Confirm the settings and then click the + Create button.
When the edge type is created successfully, the Define Properties panel shows all its properties on the list.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-edge-type/#edit_an_edge_type","title":"Edit an edge type","text":"In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Edge Type tab, find an edge type and then click the button in the Operations column.
On the Edit page, do these operations:
Comment
.To edit the TTL configuration: On the Set TTL panel, click Edit and then change the configuration of TTL_COL
and TTL_DURATION
(in seconds).
Note
For information about the coexistence problem of TTL and index, see TTL.
Danger
Confirm the impact before deleting the Edge type. The deleted data cannot be restored if it is not backup.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Edge Type tab, find an edge type and then click the button in the Operations column.
Click OK to confirm in the pop-up dialog box.
After the edge type is created, you can use the Console page to insert edge data one by one manually or use the Import page to bulk import edge data.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-index/","title":"Manage indexes","text":"You can create an index for a Tag and/or an Edge type. An index lets traversal start from vertices or edges with the same property and it can make a query more efficient. With Studio, you can use the Console page or the Schema page to create, retrieve, and delete indexes. This topic introduces how to use the Schema page to operate an index only.
Note
You can create an index when a Tag or an Edge Type is created. But an index can decrease the write speed during data import. We recommend that you import data firstly and then create and rebuild an index. For more information, see Index overview.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-index/#prerequisites","title":"Prerequisites","text":"To operate an index on the Schema page of Studio, you must do a check of these:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab and then click the + Create button.
On the Create page, do these settings:
Indexed Properties (Optional): Click Add property, and then, in the dialog box, choose a property. If necessary, repeat this step to choose more properties. You can drag the properties to sort them. In this example, degree
is chosen.
Note
The order of the indexed properties has an effect on the result of the LOOKUP
statement. For more information, see nGQL Manual.
When the settings are done, the Equivalent to the following nGQL statement panel shows the statement equivalent to the settings.
Confirm the settings and then click the + Create button. When an index is created, the index list shows the new index.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab, in the upper left corner, choose an index type, Tag or Edge Type.
In the list, find an index and click its row. All its details are shown in the expanded row.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab, in the upper left corner, choose an index type, Tag or Edge Type.
Click the Index tab, find an index and then click the button Rebuild in the Operations column.
Note
For more Information, see REBUILD INDEX.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-index/#delete_an_index","title":"Delete an index","text":"To delete an index on Schema, follow these steps:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Index tab, find an index and then click the button in the Operations column.
Click OK to confirm in the pop-up dialog box.
When Studio is connected to NebulaGraph, you can create or delete a graph space. You can use the Console page or the Schema page to do these operations. This article only introduces how to use the Schema page to operate graph spaces in NebulaGraph.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-space/#prerequisites","title":"Prerequisites","text":"To operate a graph space on the Schema page of Studio, you must do a check of these:
root
and any password to sign in to Studio.root
and its password to sign in to Studio.In the toolbar, click the Schema tab.
In the Graph Space List page, click Create Space, do these settings:
basketballplayer
is used. The name must be unique in the database.FIXED_STRING(<N>)
or INT64
. A graph space can only select one VID type. In this example, FIXED_STRING(32)
is used. For more information, see VID.Statistics of basketball players
is used.partition_num
and replica_factor
respectively. In this example, these parameters are set to 100
and 1
respectively. For more information, see CREATE SPACE
syntax.In the Equivalent to the following nGQL statement panel, you can see the statement equivalent to the preceding settings.
CREATE SPACE basketballplayer (partition_num = 100, replica_factor = 1, vid_type = FIXED_STRING(32)) COMMENT = \"Statistics of basketball players\"\n
Confirm the settings and then click the + Create button. If the graph space is created successfully, you can see it on the graph space list.
Danger
Deleting the space will delete all the data in it, and the deleted data cannot be restored if it is not backed up.
In the toolbar, click the Schema tab.
In the Graph Space List, find the space you want to be deleted, and click Delete Graph Space in the Operation column.
On the dialog box, confirm the information and then click OK.
After a graph space is created, you can create or edit a schema, including:
After a graph space is created in NebulaGraph, you can create tags. With Studio, you can use the Console page or the Schema page to create, retrieve, update, or delete tags. This topic introduces how to use the Schema page to operate tags in a graph space only.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-tag/#prerequisites","title":"Prerequisites","text":"To operate a tag on the Schema page of Studio, you must do a check of these:
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Tag tab and click the + Create button.
On the Create page, do these settings:
course
is specified.Define Properties (Optional): If necessary, click + Add Property to do these settings:
TTL_COL
and TTL_ DURATION
(in seconds). For more information about both parameters, see TTL configuration.When the preceding settings are completed, in the Equivalent to the following nGQL statement panel, you can see the nGQL statement equivalent to these settings.
Confirm the settings and then click the + Create button.
When the tag is created successfully, the Define Properties panel shows all its properties on the list.
"},{"location":"nebula-studio/manage-schema/st-ug-crud-tag/#edit_a_tag","title":"Edit a tag","text":"In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Tag tab, find a tag and then click the button in the Operations column.
On the Edit page, do these operations:
Comment
.To edit the TTL configuration: On the Set TTL panel, click Edit and then change the configuration of TTL_COL
and TTL_DURATION
(in seconds).
Note
For the problem of the coexistence of TTL and index, see TTL.
Danger
Confirm the impact before deleting the tag. The deleted data cannot be restored if it is not backup.
In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
In the Current Graph Space field, confirm the name of the graph space. If necessary, you can choose another name to change the graph space.
Click the Tag tab, find an tag and then click the button in the Operations column.
Click OK to confirm delete a tag in the pop-up dialog box.
After the tag is created, you can use the Console page to insert vertex data one by one manually or use the Import page to bulk import vertex data.
"},{"location":"nebula-studio/manage-schema/st-ug-view-schema/","title":"View Schema","text":"Users can visually view schemas in NebulaGraph Studio.
"},{"location":"nebula-studio/manage-schema/st-ug-view-schema/#steps","title":"Steps","text":"In the toolbar, click the Schema tab.
In the Graph Space List page, find a graph space and then click its name or click Schema in the Operations column.
Click View Schema tab and click the Get Schema button.
In the Graph Space List page, find a graph space and then perform the following operations in the Operations column:
Studio supports the schema drafting function. Users can design their schemas on the canvas to visually display the relationships between vertices and edges, and apply the schema to a specified graph space after the design is completed.
"},{"location":"nebula-studio/quick-start/draft/#features","title":"Features","text":"At the top navigation bar, click .
"},{"location":"nebula-studio/quick-start/draft/#design_schema","title":"Design schema","text":"The following steps take designing the schema of the basketballplayer
dataset as an example to demonstrate how to use the schema drafting function.
player
, and add two properties name
and age
.team
, and the property is name
.player
to the anchor point of the tag team
. Click the generated edge, fill in the name of the edge type as serve
, and add two properties start_year
and end_year
.player
to another one of its own. Click the generated edge, fill in the name of the edge type as follow
, and add a property degree
.Import the schema to a new or existing space, and click Confirm.
Note
Select the schema draft that you want to modify from the Draft list on the left side of the page. Click at the upper right corner after the modification.
Note
The graph space to which the schema has been applied will not be modified synchronously.
"},{"location":"nebula-studio/quick-start/draft/#delete_schema","title":"Delete schema","text":"Select the schema draft that you want to delete from the Draft list on the left side of the page, click X at the upper right corner of the thumbnail, and confirm to delete it.
"},{"location":"nebula-studio/quick-start/draft/#export_schema","title":"Export Schema","text":"Click at the upper right corner to export the schema as a PNG image.
"},{"location":"nebula-studio/quick-start/st-ug-console/","title":"Console","text":"Studio console interface is shown as follows.
"},{"location":"nebula-studio/quick-start/st-ug-console/#entry","title":"Entry","text":"In the top navigation bar, click Console.
"},{"location":"nebula-studio/quick-start/st-ug-console/#overview","title":"Overview","text":"The following table lists the functions on the console page.
number function descriptions 1 View the schema Display the schemas of the graph spaces. 2 Select a space Select a space in the graph space drop down list. The console does not support using theUSE <space_name>
statement to switch graph spaces. 3 Favorites Click the button to expand the favorites. Select a statement, and it automatically populates the input box. 4 History list Click the button to view the execution history. In the execution history list, click one of the statements, and the statement is automatically populates the input box. The list provides the record of the last 15 statements.Type /
in the input box to quickly select a historical query statement. 5 Clean input box Click the button to clear the content populated in the input box. 6 Run After entering the nGQL statement in the input box, click the button to start running the statement. 7 Input box The area where the nGQL statement is entered. The statement displays different colors depending on the schemas or character strings. Code auto-completion is supported. You can quickly enter a tag or edge type based on the schema.You can input multiple statements and run them at the same time by using the separator ;
. Use the symbol //
to add comments.Support right-clicking on a selected statement and then performing operations such as cut, copy, or run. 8 Custom parameters display Click the button to expand the custom parameters for the parameterized query. For details, see Manage parameters. 9 Statement running status After running the nGQL statement, the statement running status is displayed. If the statement runs successfully, the statement is displayed in green. If the statement fails, the statement is displayed in red. 10 Add to favorites Click the button to save the statement as a favorite. The button for the favorite statement is colored in yellow. 11 Export CSV file or PNG file After running the nGQL statement to return the result, when the result is in the Table window, click the button to export as a CSV file. Switch to the Graph window and click the button to export the results as a CSV file or a PNG image. 12 Expand/hide execution results Click the button to hide the result or click to expand the result. 13 Close execution results Click the button to close the result returned by this nGQL statement. 14 Table window Display the results returned by the nGQL statement in a table. 15 Plan window Display the execution plan. If an EXPLAIN
or PROFILE
statement is executed, the window presents the execution plan in visual form. See the description of the execution plan below. 16 Graph window Display the results returned by the nGQL statement in a graph if the results contain complete vertex and edge information. Click the button on the right to view the overview panel. 17 AI Assistant You can chat with an AI assistant to convert natural language instructions into nGQL query statements and then copy the nGQL statements into the input box with one click. This feature needs to be set up and enabled in the system settings before use. Note: The schema information of the current graph space is sent to the large language model when you chat with the assistant. Please pay attention to information security. You can click the text2match toggle to switch between general Q&A and query Q&A. The query Q&A can convert the natural language instructions to nGQL query statements."},{"location":"nebula-studio/quick-start/st-ug-console/#execution_plan_descriptions","title":"Execution plan descriptions","text":"Studio can display the execution plan of the statement. The execution plan descriptions are as follows.
No. Description 1 AnEXPLAIN
or PROFILE
statement. 2 The operators used by the execution plan, which are sorted according to the execution duration. The top three operators are labeled as red, orange, and yellow, respectively. Clicking on an operator directly selects the corresponding operator in the operator execution flow and displays the operator information. Note: The PROFILE
statement actually executes the statement, and the actual execution durations can be obtained and sorted. The EXPLAIN
statement does not execute the statement, and all operators are considered to have the same execution duration and are all labeled as red. 3 The operator execution flow. For each operator, the following information is displayed: in-parameters, out-parameters, and total execution duration. The Select
, Loop
, PassThrough
, and Start
operators have independent color schemes. The arrows show the direction of data flow and the number of rows. The thicker the arrows, the more rows of data. You can click on the operator to check the details of the operator on the right side. 4 The details of the operator, divided into Profiling data
and Operator info
.Profiling data
shows the performance data of the operator, including the rows of data received, the execution time, the total time, etc. Operator info
shows the detailed operation information of the operator. 5 Zoom out, zoom in, or reverse the execution flow. 6 The duration of the statement. 7 Full screen or cancel full screen."},{"location":"nebula-studio/quick-start/st-ug-create-schema/","title":"Create a schema","text":"To batch import data into NebulaGraph, you must have a graph schema. You can create a schema on the Console page or on the Schema page of Studio.
Note
To create a graph schema on Studio, you must do a check of these:
Note
If no graph space exists and your account has the GOD privilege, you can create a graph space on the Console page. For more information, see CREATE SPACE.
"},{"location":"nebula-studio/quick-start/st-ug-create-schema/#create_a_schema_with_schema","title":"Create a schema with Schema","text":"Create tags. For more information, see Operate tags.
Create edge types. For more information, see Operate edge types.
In the toolbar, click the Console tab.
In the Current Graph Space field, choose a graph space name. In this example, basketballplayer is used.
In the input box, enter these statements one by one and click the button Run.
// To create a tag named \"player\", with two property\nnebula> CREATE TAG player(name string, age int);\n\n// To create a tag named \"team\", with one property\nnebula> CREATE TAG team(name string);\n\n// To create an edge type named \"follow\", with one properties\nnebula> CREATE EDGE follow(degree int);\n\n// To create an edge type named \"serve\", with two properties\nnebula> CREATE EDGE serve(start_year int, end_year int);\n
If the preceding statements are executed successfully, the schema is created. You can run the statements as follows to view the schema.
// To list all the tags in the current graph space\nnebula> SHOW TAGS;\n\n// To list all the edge types in the current graph space\nnebula> SHOW EDGES;\n\n// To view the definition of the tags and edge types\nnebula> DESCRIBE TAG player;\nnebula> DESCRIBE TAG team;\nnebula> DESCRIBE EDGE follow;\nnebula> DESCRIBE EDGE serve;\n
If the schema is created successfully, in the result window, you can see the definition of the tags and edge types.
"},{"location":"nebula-studio/quick-start/st-ug-create-schema/#next_to_do","title":"Next to do","text":"When a schema is created, you can import data.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/","title":"Import data","text":"Studio supports importing data in CSV format into NebulaGraph through an interface.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#prerequisites","title":"Prerequisites","text":"To batch import data, do a check of these:
In the top navigation bar, click Import.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#steps","title":"Steps","text":"Importing data is divided into 2 parts, creating a new data source and creating an import task, which will be described in detail next.
Note
You can also import tasks via the AI Import feature, which is a beta feature that needs to be enabled and configured in the system settings before use.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#create_a_new_data_source","title":"Create a new data source","text":"Click New Data Source in the upper right corner of the page to set the data source and its related settings. Currently, 3 types of data sources are supported.
Type of data source Description Cloud storage Add cloud storage as the CSV file source, which only supports cloud services compatible with the Amazon S3 interface. SFTP Add SFTP as the CSV file source. Local file Upload a local CSV file. The file size cannot exceed 200 MB. Put files exceeding this limit into one of the other types of data sources.
Click New Import at the top left corner of the page to complete the following settings:
Caution
Users can also click Import Template to download the sample configuration file example.yaml
, configure it and then upload the configuration file. Configure in the same way as NebulaGraph Importer.
Map Tags:
NULL
or have DEFAULT
set, you can leave the corresponding column unspecified.After completing the settings, click Import, enter the password for the NebulaGraph account, and confirm.
After the import task is created, you can view the progress of the import task in the Import Data tab, which supports operations such as filtering tasks based on graph space, editing the task, viewing logs, downloading logs, reimporting, downloading configuration files, and deleting tasks.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#import_data_using_ai_import","title":"Import data using AI Import","text":"Note
After the import task is completed, check whether the data is imported successfully. If not, it is recommended that you check the task logs on the import page to see whether issues such as timeouts, privacy policy violations, service interruption, or encoding errors occurred.
Click AI Import in the upper left corner of the page to complete the following settings:
You can view the LLM
parameters related to AI import in the configuration file.
After completing the settings, click Next to confirm the file for import and the AI URL to be used, and then click Start.
After the import task is created, you can view the progress of the import task on the Import Data tab, which supports operations such as viewing logs, downloading logs, reimporting, and deleting tasks.
"},{"location":"nebula-studio/quick-start/st-ug-import-data/#next","title":"Next","text":"After completing the data import, users can access the Console page.
"},{"location":"nebula-studio/quick-start/st-ug-plan-schema/","title":"Design a schema","text":"To manipulate graph data in NebulaGraph with Studio, you must have a graph schema. This article introduces how to design a graph schema for NebulaGraph.
A graph schema for NebulaGraph must have these essential elements:
In this article, you can install the sample data set basketballplayer and use it to explore a pre-designed schema.
This table gives all the essential elements of the schema.
Element Name Property name (Data type) Description Tag player -name
(string
) - age
(int
) Represents the player. Tag team - name
(string
) Represents the team. Edge type serve - start_year
(int
) - end_year
(int
) Represents the player's behavior. This behavior connects the player to the team, and the direction is from player to team. Edge type follow - degree
(int
) Represents the player's behavior. This behavior connects the player to the player, and the direction is from a player to a player.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/","title":"Connecting to the database error","text":""},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#problem_description","title":"Problem description","text":"According to the connect Studio operation, it prompts failed.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#possible_causes_and_solutions","title":"Possible causes and solutions","text":"You can troubleshoot the problem by following the steps below.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#step1_confirm_that_the_format_of_the_host_field_is_correct","title":"Step1: Confirm that the format of the Host field is correct","text":"You must fill in the IP address (graph_server_ip
) and port of the NebulaGraph database Graph service. If no changes are made, the port defaults to 9669
. Even if NebulaGraph and Studio are deployed on the current machine, you must use the local IP address instead of 127.0.0.1
, localhost
or 0.0.0.0
.
If authentication is not enabled, you can use root and any password as the username and its password.
If authentication is enabled and different users are created and assigned roles, users in different roles log in with their accounts and passwords.
"},{"location":"nebula-studio/troubleshooting/st-ug-config-server-errors/#step3_confirm_that_nebulagraph_service_is_normal","title":"Step3: Confirm that NebulaGraph service is normal","text":"Check NebulaGraph service status. Regarding the operation of viewing services:
If the NebulaGraph service is normal, proceed to Step 4 to continue troubleshooting. Otherwise, please restart NebulaGraph service.
Note
If you used docker-compose up -d
to satrt NebulaGraph before, you must run the docker-compose down
to stop NebulaGraph.
Run a command (for example, telnet 9669) on the Studio machine to confirm whether NebulaGraph's Graph service network connection is normal.
If the connection fails, check according to the following steps:
If you cannot connect to the NebulaGraph service after troubleshooting with the above steps, please go to the NebulaGraph forum for consultation.
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/","title":"Cannot access to Studio","text":""},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#problem_description","title":"Problem description","text":"I follow the document description and visit 127.0.0.1:7001
or 0.0.0.0:7001
after starting Studio, why can\u2019t I open the page?
You can troubleshoot the problem by following the steps below.
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#step1_confirm_system_architecture","title":"Step1: Confirm system architecture","text":"It is necessary to confirm whether the machine where the Studio service is deployed is of x86_64 architecture. Currently, Studio only supports x86_64 architecture.
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#step2_check_if_the_studio_service_starts_normally","title":"Step2: Check if the Studio service starts normally","text":"systemctl status nebula-graph-studio
to see the running status.sudo lsof -i:7001
to check port status.For Studio deployed with docker, use docker-compose ps
to see the running status. Run docker-compose ps
to check if the service has started normally.
If the service is normal, the return result is as follows. Among them, the State
column should all be displayed as Up
.
Name Command State Ports\n------------------------------------------------------------------------------------------------------\nnebula-web-docker_client_1 ./nebula-go-api Up 0.0.0.0:32782->8080/tcp\nnebula-web-docker_importer_1 nebula-importer --port=569 ... Up 0.0.0.0:32783->5699/tcp\nnebula-web-docker_nginx_1 /docker-entrypoint.sh ngin ... Up 0.0.0.0:7001->7001/tcp, 80/tcp\nnebula-web-docker_web_1 docker-entrypoint.sh npm r ... Up 0.0.0.0:32784->7001/tcp\n
If the above result is not returned, stop Studio and restart it first. For details, refer to Deploy Studio.
!!! note
If you used `docker-compose up -d` to satrt NebulaGraph before, you must run the `docker-compose down` to stop NebulaGraph.\n
"},{"location":"nebula-studio/troubleshooting/st-ug-connection-errors/#step3_confirm_address","title":"Step3: Confirm address","text":"If Studio and the browser are on the same machine, users can use localhost:7001
, 127.0.0.1:7001
or 0.0.0.0:7001
in the browser to access Studio.
If Studio and the browser are not on the same machine, you must enter <studio_server_ip>:7001
in the browser. Among them, studio_server_ip
refers to the IP address of the machine where the Studio service is deployed.
Run curl <studio_server_ip>:7001
-I to confirm if it is normal. If it returns HTTP/1.1 200 OK
, it means that the network is connected normally.
If the connection is refused, check according to the following steps:
If the connection fails, check according to the following steps:
If you cannot connect to the NebulaGraph service after troubleshooting with the above steps, please go to the NebulaGraph forum for consultation.
"},{"location":"nebula-studio/troubleshooting/st-ug-faq/","title":"FAQ","text":"Why can't I use a function?
If you find that a function cannot be used, it is recommended to troubleshoot the problem according to the following steps:
Confirm that NebulaGraph is the latest version. If you use Docker Compose to deploy the NebulaGraph database, it is recommended to run docker-compose pull && docker-compose up -d
to pull the latest Docker image and start the container.
Confirm that Studio is the latest version. For more information, refer to check updates.
Search the nebula forum, nebula and nebula-studio projects on the GitHub to confirm if there are already similar problems.
If none of the above steps solve the problem, you can submit a problem on the forum.
num_active_queries
The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. num_active_sessions
The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time.For example, when querying num_active_sessions.sum.5
, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is -20
(10-30). num_aggregate_executors
The number of executions for the Aggregation operator. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions_out_of_max_allowed
The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS
was exceeded. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_indexscan_executors
The number of executions for index scan operators. num_killed_queries
The number of killed queries. num_opened_sessions
The number of sessions connected to the server. num_queries
The number of queries. num_query_errors_leader_changes
The number of the raft leader changes due to query errors. num_query_errors
The number of query errors. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. num_sentences
The number of statements received by the Graphd service. num_slow_queries
The number of slow queries. num_sort_executors
The number of executions for the Sort operator. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. slow_query_latency_us
The latency of slow queries. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. resp_part_completeness
The completeness of the partial success. You need to set accept_partial_success
to true
in the graph configuration first."},{"location":"reuse/source-monitoring-metrics/#meta","title":"Meta","text":"Parameter Description commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. heartbeat_latency_us
The latency of heartbeats. num_heartbeats
The number of heartbeats. num_raft_votes
The number of votes in Raft. transfer_leader_latency_us
The latency of transferring the raft leader. num_agent_heartbeats
The number of heartbeats for the AgentHBProcessor. agent_heartbeat_latency_us
The latency of the AgentHBProcessor. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_send_snapshot
The number of times that Raft sends snapshots to other nodes. append_log_latency_us
The latency of replicating the log record to a single node by Raft. append_wal_latency_us
The Raft write latency for a single WAL. num_grant_votes
The number of times that Raft votes for other nodes. num_start_elect
The number of times that Raft starts an election."},{"location":"reuse/source-monitoring-metrics/#storage","title":"Storage","text":"Parameter Description add_edges_latency_us
The latency of adding edges. add_vertices_latency_us
The latency of adding vertices. commit_log_latency_us
The latency of committing logs in Raft. commit_snapshot_latency_us
The latency of committing snapshots in Raft. delete_edges_latency_us
The latency of deleting edges. delete_vertices_latency_us
The latency of deleting vertices. get_neighbors_latency_us
The latency of querying neighbor vertices. get_dst_by_src_latency_us
The latency of querying the destination vertex by the source vertex. num_get_prop
The number of executions for the GetPropProcessor. num_get_neighbors_errors
The number of execution errors for the GetNeighborsProcessor. num_get_dst_by_src_errors
The number of execution errors for the GetDstBySrcProcessor. get_prop_latency_us
The latency of executions for the GetPropProcessor. num_edges_deleted
The number of deleted edges. num_edges_inserted
The number of inserted edges. num_raft_votes
The number of votes in Raft. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Storage service sent to the Meta service. num_rpc_sent_to_metad
The number of RPC requests that the Storaged service sent to the Metad service. num_tags_deleted
The number of deleted tags. num_vertices_deleted
The number of deleted vertices. num_vertices_inserted
The number of inserted vertices. transfer_leader_latency_us
The latency of transferring the raft leader. lookup_latency_us
The latency of executions for the LookupProcessor. num_lookup_errors
The number of execution errors for the LookupProcessor. num_scan_vertex
The number of executions for the ScanVertexProcessor. num_scan_vertex_errors
The number of execution errors for the ScanVertexProcessor. update_edge_latency_us
The latency of executions for the UpdateEdgeProcessor. num_update_vertex
The number of executions for the UpdateVertexProcessor. num_update_vertex_errors
The number of execution errors for the UpdateVertexProcessor. kv_get_latency_us
The latency of executions for the GetProcessor. kv_put_latency_us
The latency of executions for the PutProcessor. kv_remove_latency_us
The latency of executions for the RemoveProcessor. num_kv_get_errors
The number of execution errors for the GetProcessor. num_kv_get
The number of executions for the GetProcessor. num_kv_put_errors
The number of execution errors for the PutProcessor. num_kv_put
The number of executions for the PutProcessor. num_kv_remove_errors
The number of execution errors for the RemoveProcessor. num_kv_remove
The number of executions for the RemoveProcessor. forward_tranx_latency_us
The latency of transmission. scan_edge_latency_us
The latency of executions for the ScanEdgeProcessor. num_scan_edge_errors
The number of execution errors for the ScanEdgeProcessor. num_scan_edge
The number of executions for the ScanEdgeProcessor. scan_vertex_latency_us
The latency of executions for the ScanVertexProcessor. num_add_edges
The number of times that edges are added. num_add_edges_errors
The number of errors when adding edges. num_add_vertices
The number of times that vertices are added. num_start_elect
The number of times that Raft starts an election. num_add_vertices_errors
The number of errors when adding vertices. num_delete_vertices_errors
The number of errors when deleting vertices. append_log_latency_us
The latency of replicating the log record to a single node by Raft. num_grant_votes
The number of times that Raft votes for other nodes. replicate_log_latency_us
The latency of replicating the log record to most nodes by Raft. num_delete_tags
The number of times that tags are deleted. num_delete_tags_errors
The number of errors when deleting tags. num_delete_edges
The number of edge deletions. num_delete_edges_errors
The number of errors when deleting edges. num_send_snapshot
The number of times that snapshots are sent. update_vertex_latency_us
The latency of executions for the UpdateVertexProcessor. append_wal_latency_us
The Raft write latency for a single WAL. num_update_edge
The number of executions for the UpdateEdgeProcessor. delete_tags_latency_us
The latency of deleting tags. num_update_edge_errors
The number of execution errors for the UpdateEdgeProcessor. num_get_neighbors
The number of executions for the GetNeighborsProcessor. num_get_dst_by_src
The number of executions for the GetDstBySrcProcessor. num_get_prop_errors
The number of execution errors for the GetPropProcessor. num_delete_vertices
The number of times that vertices are deleted. num_lookup
The number of executions for the LookupProcessor. num_sync_data
The number of times the Storage service synchronizes data from the Drainer. num_sync_data_errors
The number of errors that occur when the Storage service synchronizes data from the Drainer. sync_data_latency_us
The latency of the Storage service synchronizing data from the Drainer."},{"location":"reuse/source-monitoring-metrics/#graph_space","title":"Graph space","text":"Note
Space-level metrics are created dynamically, so that only when the behavior is triggered in the graph space, the corresponding metric is created and can be queried by the user.
Parameter Descriptionnum_active_queries
The number of queries currently being executed. num_queries
The number of queries. num_sentences
The number of statements received by the Graphd service. optimizer_latency_us
The latency of executing optimizer statements. query_latency_us
The latency of queries. num_slow_queries
The number of slow queries. num_query_errors
The number of query errors. num_query_errors_leader_changes
The number of raft leader changes due to query errors. num_killed_queries
The number of killed queries. num_aggregate_executors
The number of executions for the Aggregation operator. num_sort_executors
The number of executions for the Sort operator. num_indexscan_executors
The number of executions for index scan operators. num_auth_failed_sessions_bad_username_password
The number of sessions where authentication failed due to incorrect username and password. num_auth_failed_sessions
The number of sessions in which login authentication failed. num_opened_sessions
The number of sessions connected to the server. num_queries_hit_memory_watermark
The number of queries reached the memory watermark. num_reclaimed_expired_sessions
The number of expired sessions actively reclaimed by the server. num_rpc_sent_to_metad_failed
The number of failed RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_metad
The number of RPC requests that the Graphd service sent to the Metad service. num_rpc_sent_to_storaged_failed
The number of failed RPC requests that the Graphd service sent to the Storaged service. num_rpc_sent_to_storaged
The number of RPC requests that the Graphd service sent to the Storaged service. slow_query_latency_us
The latency of slow queries."},{"location":"reuse/source_connect-to-nebula-graph/","title":"Source connect to nebula graph","text":"This topic provides basic instruction on how to use the native CLI client NebulaGraph Console to connect to NebulaGraph.
Caution
When connecting to NebulaGraph for the first time, you must register the Storage Service before querying data.
NebulaGraph supports multiple types of clients, including a CLI client, a GUI client, and clients developed in popular programming languages. For more information, see the client list.
"},{"location":"reuse/source_connect-to-nebula-graph/#prerequisites","title":"Prerequisites","text":"The NebulaGraph Console version is compatible with the NebulaGraph version.
Note
NebulaGraph Console and NebulaGraph of the same version number are the most compatible. There may be compatibility issues when connecting to NebulaGraph with a different version of NebulaGraph Console. The error message incompatible version between client and server
is displayed when there is such an issue.
On the NebulaGraph Console releases page, select a NebulaGraph Console version and click Assets.
Note
It is recommended to select the latest version.
In the Assets area, find the correct binary file for the machine where you want to run NebulaGraph Console and download the file to the machine.
(Optional) Rename the binary file to nebula-console
for convenience.
Note
For Windows, rename the file to nebula-console.exe
.
On the machine to run NebulaGraph Console, grant the execute permission of the nebula-console binary file to the user.
Note
For Windows, skip this step.
$ chmod 111 nebula-console\n
In the command line interface, change the working directory to the one where the nebula-console binary file is stored.
Run the following command to connect to NebulaGraph.
$ ./nebula-console -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
> nebula-console.exe -addr <ip> -port <port> -u <username> -p <password>\n[-t 120] [-e \"nGQL_statement\" | -f filename.nGQL]\n
Parameter descriptions are as follows:
Parameter Description-h/-help
Shows the help menu. -addr/-address
Sets the IP (or hostname) of the Graph service. The default address is 127.0.0.1. -P/-port
Sets the port number of the graphd service. The default port number is 9669. -u/-user
Sets the username of your NebulaGraph account. Before enabling authentication, you can use any existing username. The default username is root
. -p/-password
Sets the password of your NebulaGraph account. Before enabling authentication, you can use any characters as the password. -t/-timeout
Sets an integer-type timeout threshold of the connection. The unit is millisecond. The default value is 120. -e/-eval
Sets a string-type nGQL statement. The nGQL statement is executed once the connection succeeds. The connection stops after the result is returned. -f/-file
Sets the path of an nGQL file. The nGQL statements in the file are executed once the connection succeeds. The result will be returned and the connection stops then. -enable_ssl
Enables SSL encryption when connecting to NebulaGraph. -ssl_root_ca_path
Sets the storage path of the certification authority file. -ssl_cert_path
Sets the storage path of the certificate file. -ssl_private_key_path
Sets the storage path of the private key file. For information on more parameters, see the project repository.
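For example, a sketch that combines these options to run a single statement and exit; the address and credentials are illustrative:
$ ./nebula-console -addr 192.168.8.100 -port 9669 -u root -p nebula -e \"SHOW HOSTS\"\n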
RPM and DEB are common package formats on Linux systems. This topic shows how to quickly install NebulaGraph with the RPM or DEB package.
Note
The console is not complied or packaged with NebulaGraph server binaries. You can install nebula-console by yourself.
"},{"location":"reuse/source_install-nebula-graph-by-rpm-or-deb/#prerequisites","title":"Prerequisites","text":"wget
is installed.Note
NebulaGraph is currently only supported for installation on Linux systems, and only CentOS 7.x, CentOS 8.x, Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 operating systems are supported.
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/<release_version>/nebula-graph-<release_version>.ubuntu2004.amd64.deb\n
For example, download the release package master
for Centos 7.5
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.el7.x86_64.rpm.sha256sum.txt\n
Download the release package master
for Ubuntu 1804
:
wget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/master/nebula-graph-master.ubuntu1804.amd64.deb.sha256sum.txt\n
Download the nightly version.
Danger
URL:
//Centos 7\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el7.x86_64.rpm\n\n//Centos 8\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.el8.x86_64.rpm\n\n//Ubuntu 1604\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1604.amd64.deb\n\n//Ubuntu 1804\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu1804.amd64.deb\n\n//Ubuntu 2004\nhttps://oss-cdn.nebula-graph.io/package/nightly/<yyyy.mm.dd>/nebula-graph-<yyyy.mm.dd>-nightly.ubuntu2004.amd64.deb\n
For example, download the Centos 7.5
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.el7.x86_64.rpm.sha256sum.txt\n
For example, download the Ubuntu 1804
package developed and built in 2021.11.28
:
wget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb\nwget https://oss-cdn.nebula-graph.io/package/nightly/2021.11.28/nebula-graph-2021.11.28-nightly.ubuntu1804.amd64.deb.sha256sum.txt\n
Use the following syntax to install with an RPM package.
$ sudo rpm -ivh --prefix=<installation_path> <package_name>\n
The option --prefix
indicates the installation path. The default path is /usr/local/nebula/
.
For example, to install an RPM package in the default path for the master version, run the following command.
sudo rpm -ivh nebula-graph-master.el7.x86_64.rpm\n
Use the following syntax to install with a DEB package.
$ sudo dpkg -i <package_name>\n
Note
Customizing the installation path is not supported when installing NebulaGraph with a DEB package. The default installation path is /usr/local/nebula/
.
For example, to install a DEB package for the master version, run the following command.
sudo dpkg -i nebula-graph-master.ubuntu1804.amd64.deb\n
Note
The default installation path is /usr/local/nebula/
.
NebulaGraph supports managing services with scripts.
"},{"location":"reuse/source_manage-service/#manage_services_with_script","title":"Manage services with script","text":"You can use the nebula.service
script to start, stop, restart, terminate, and check the NebulaGraph services.
Note
nebula.service
is stored in the /usr/local/nebula/scripts
directory by default. If you have customized the path, use the actual path in your environment.
$ sudo /usr/local/nebula/scripts/nebula.service\n[-v] [-c <config_file_path>]\n<start | stop | restart | kill | status>\n<metad | graphd | storaged | all>\n
Parameter Description -v
Display detailed debugging information. -c
Specify the configuration file path. The default path is /usr/local/nebula/etc/
. start
Start the target services. stop
Stop the target services. restart
Restart the target services. kill
Terminate the target services. status
Check the status of the target services. metad
Set the Meta Service as the target service. graphd
Set the Graph Service as the target service. storaged
Set the Storage Service as the target service. all
Set all the NebulaGraph services as the target services."},{"location":"reuse/source_manage-service/#start_nebulagraph","title":"Start NebulaGraph","text":"Run the following command to start NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service start all\n[INFO] Starting nebula-metad...\n[INFO] Done\n[INFO] Starting nebula-graphd...\n[INFO] Done\n[INFO] Starting nebula-storaged...\n[INFO] Done\n
"},{"location":"reuse/source_manage-service/#stop_nebulagraph","title":"Stop NebulaGraph","text":"Danger
Do not run kill -9
to forcibly terminate the processes. Otherwise, there is a low probability of data loss.
Run the following command to stop NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service stop all\n[INFO] Stopping nebula-metad...\n[INFO] Done\n[INFO] Stopping nebula-graphd...\n[INFO] Done\n[INFO] Stopping nebula-storaged...\n[INFO] Done\n
"},{"location":"reuse/source_manage-service/#check_the_service_status","title":"Check the service status","text":"Run the following command to check the service status of NebulaGraph.
$ sudo /usr/local/nebula/scripts/nebula.service status all\n
NebulaGraph is running normally if the following information is returned.
INFO] nebula-metad(33fd35e): Running as 29020, Listening on 9559\n[INFO] nebula-graphd(33fd35e): Running as 29095, Listening on 9669\n[WARN] nebula-storaged after v3.0.0 will not start service until it is added to cluster.\n[WARN] See Manage Storage hosts:ADD HOSTS in https://docs.nebula-graph.io/\n[INFO] nebula-storaged(33fd35e): Running as 29147, Listening on 9779\n
Note
After starting NebulaGraph, the port of the nebula-storaged
process is shown in red. Because the nebula-storaged
process waits for the nebula-metad
to add the current Storage service during the startup process. The Storage works after it receives the ready signal. Starting from NebulaGraph 3.0.0, the Meta service cannot directly read or write data in the Storage service that you add in the configuration file. The configuration file only registers the Storage service to the Meta service. You must run the ADD HOSTS
command to enable the Meta to read and write data in the Storage service. For more information, see Manage Storage hosts.
[INFO] nebula-metad: Running as 25600, Listening on 9559\n[INFO] nebula-graphd: Exited\n[INFO] nebula-storaged: Running as 25646, Listening on 9779\n
The NebulaGraph services consist of the Meta Service, Graph Service, and Storage Service. The configuration files for all three services are stored in the /usr/local/nebula/etc/
directory by default. You can check the configuration files according to the returned result to troubleshoot problems.
Connect to NebulaGraph
"},{"location":"synchronization-and-migration/2.balance-syntax/","title":"BALANCE syntax","text":"We can submit tasks to load balance Storage services in NebulaGraph. For more information about storage load balancing and examples, see Storage load balance.
Note
For other job management commands, see Job manager and the JOB statements.
The syntax for load balance is described as follows.
Syntax DescriptionSUBMIT JOB BALANCE LEADER
Starts a job to balance the distribution of all the storage leaders in all graph spaces. It returns the job ID. For details about how to view, stop, and restart a job, see Job manager and the JOB statements.
"}]} \ No newline at end of file