[Neo4j Series, Part 8] Importing Data at Scale with neo4j-import


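LOAD CSV is the most common way to get data into Neo4j, and it works well for small and medium-sized data sets. For large-scale imports there are three options: Batch Inserter (for Java), Batch Import (a prebuilt jar based on Batch Inserter), and neo4j-import. This article focuses on neo4j-import, the official Neo4j tool, which is meant for loading data into a database that has not been initialized yet.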

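neo4j-import is just one of the commands of neo4j-admin (neo4j-admin import). It pays to generalize from a single example, so this article also walks through the other neo4j-admin commands rather than import alone.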


  • Common neo4j-admin commands

neo4j-admin <command>

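1. check-consistency: check the consistency of a database

2. help: show help

3. import: import data

4. memrec: print memory configuration recommendations

5. store-info: print information about the database store

6. set-default-admin: set the default admin user

7. set-initial-password: set the initial password

8. dump: dump a database to an archive (backup)

9. load: load a database from an archive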

  • Basic syntax of import

neo4j-admin import [--mode=csv] [--database=<name>]

                         [--additional-config=<config-file-path>]

                         [--report-file=<filename>]

                         [--nodes[:Label1:Label2]=<"file1,file2,...">]

                         [--relationships[:RELATIONSHIP_TYPE]=<"file1,file2,...">]

                         [--id-type=<STRING|INTEGER|ACTUAL>]

                         [--input-encoding=<character-set>]

                         [--ignore-extra-columns[=<true|false>]]

                         [--ignore-duplicate-nodes[=<true|false>]]

                         [--ignore-missing-nodes[=<true|false>]]

                         [--multiline-fields[=<true|false>]]

                         [--delimiter=<delimiter-character>]

                         [--array-delimiter=<array-delimiter-character>]

                         [--quote=<quotation-character>]

                         [--max-memory=<max-memory-that-importer-can-use>]

For example, to import data from three files:

neo4j-admin import --nodes import/movies.csv --nodes import/actors.csv --relationships import/roles.csv
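
When the file lists come from a script, the same command line can be assembled programmatically. The following is a minimal sketch in Python, not part of the Neo4j tooling: `build_import_command` is a hypothetical helper that only uses the `--nodes`, `--relationships` and `--database` options listed in the syntax above. It builds the argument list and does not run neo4j-admin itself.

```python
from typing import List, Optional, Sequence


def build_import_command(
    node_files: Sequence[str],
    relationship_files: Sequence[str],
    database: Optional[str] = None,
) -> List[str]:
    """Assemble a `neo4j-admin import` argument list (hypothetical helper)."""
    argv = ["neo4j-admin", "import"]
    if database is not None:
        argv.append(f"--database={database}")
    # One --nodes / --relationships option per input file,
    # mirroring the three-file example above.
    for path in node_files:
        argv += ["--nodes", path]
    for path in relationship_files:
        argv += ["--relationships", path]
    return argv


if __name__ == "__main__":
    cmd = build_import_command(
        ["import/movies.csv", "import/actors.csv"],
        ["import/roles.csv"],
    )
    print(" ".join(cmd))
```

Keeping the arguments as a list rather than one string means the result can be handed to `subprocess.run` without shell quoting issues.
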
  • Design rules for the CSV files to import

 

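1. Properties

<name>: the field name

Data types: int, long, float, double, boolean, byte, short, char, string; the default is string

The default delimiter for multiple values in one field is ; and --array-delimiter declares a different one

Use :IGNORE to skip a field's data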
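
To make the header rules concrete, here is a minimal sketch in Python that splits a header field such as `year:int` into a name and a type and applies the default type `string`. The names `parse_header_field` and `split_array` are assumptions made for illustration; this is not how neo4j-admin parses headers internally, and special fields such as `:ID` or `:LABEL` are simply returned as the type part.

```python
from typing import List, Tuple


def parse_header_field(field: str) -> Tuple[str, str]:
    """Split 'name:type' into (name, type); the type defaults to 'string'."""
    name, sep, type_name = field.strip().partition(":")
    return name, (type_name if sep else "string")


def split_array(value: str, array_delimiter: str = ";") -> List[str]:
    """Split a multi-valued cell on the array delimiter (';' by default)."""
    return value.split(array_delimiter) if value else []


if __name__ == "__main__":
    for field in ["movieId:ID", "title", "year:int", ":LABEL"]:
        print(field, "->", parse_header_field(field))
    print(split_array("Movie;Sequel"))
```
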

2. Node data

<name>:ID  One field must define the node ID; it is required and must be unique

:LABEL  Defines the labels; separate multiple labels with ;

Example:

movieId:ID,title,year:int,:LABEL
tt0133093,"The Matrix",1999,Movie
tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
tt0242653,"The Matrix Revolutions",2003,Movie;Sequel
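
A node file like this one is usually generated from some other data source. The following is a rough sketch using Python's standard `csv` module; the function `nodes_to_csv` and the dictionary keys (`id`, `title`, `year`, `labels`) are made up for the example. Note that `csv.writer` only adds quotes where a value needs them, so the titles come out unquoted here.

```python
import csv
import io
from typing import Dict, Iterable


def nodes_to_csv(movies: Iterable[Dict]) -> str:
    """Render movie records as a node file with a typed header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["movieId:ID", "title", "year:int", ":LABEL"])
    for movie in movies:
        # Multiple labels go into one cell, separated by ';'.
        writer.writerow(
            [movie["id"], movie["title"], movie["year"], ";".join(movie["labels"])]
        )
    return buf.getvalue()


if __name__ == "__main__":
    print(nodes_to_csv([
        {"id": "tt0133093", "title": "The Matrix", "year": 1999, "labels": ["Movie"]},
        {"id": "tt0234215", "title": "The Matrix Reloaded", "year": 2003,
         "labels": ["Movie", "Sequel"]},
    ]))
```
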

3. Relationship data

:START_ID is the ID of the start node, :END_ID the ID of the end node, and :TYPE the relationship type

Example:

:START_ID,role,:END_ID,:TYPE
keanu,"Neo",tt0133093,ACTED_IN
keanu,"Neo",tt0234215,ACTED_IN
keanu,"Neo",tt0242653,ACTED_IN
laurence,"Morpheus",tt0133093,ACTED_IN
laurence,"Morpheus",tt0234215,ACTED_IN
laurence,"Morpheus",tt0242653,ACTED_IN
carrieanne,"Trinity",tt0133093,ACTED_IN
carrieanne,"Trinity",tt0234215,ACTED_IN
carrieanne,"Trinity",tt0242653,ACTED_IN
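
Every :START_ID and :END_ID has to match a node ID from the node files; relationships that point at missing nodes are what the --ignore-missing-nodes option in the syntax above is about. A pre-flight check along these lines can catch such rows before a long import starts. This is only a sketch under simple assumptions: the helper names are invented, the CSV content is passed in as strings, and ID spaces are not handled.

```python
import csv
import io
from typing import List, Set, Tuple


def read_node_ids(nodes_csv: str) -> Set[str]:
    """Collect the values of the column whose header field ends in ':ID'."""
    reader = csv.reader(io.StringIO(nodes_csv))
    header = next(reader)
    id_col = next(i for i, f in enumerate(header) if f.strip().endswith(":ID"))
    return {row[id_col] for row in reader if row}


def missing_references(rels_csv: str, node_ids: Set[str]) -> List[Tuple[str, str]]:
    """Return (column, value) pairs for endpoints that match no known node."""
    missing = []
    for row in csv.DictReader(io.StringIO(rels_csv)):
        for column in (":START_ID", ":END_ID"):
            if row[column] not in node_ids:
                missing.append((column, row[column]))
    return missing


if __name__ == "__main__":
    movies = 'movieId:ID,title,year:int,:LABEL\ntt0133093,"The Matrix",1999,Movie\n'
    actors = 'personId:ID,name,:LABEL\nkeanu,"Keanu Reeves",Actor\n'
    roles = ':START_ID,role,:END_ID,:TYPE\nkeanu,"Neo",tt0133093,ACTED_IN\n'
    ids = read_node_ids(movies) | read_node_ids(actors)
    print(missing_references(roles, ids))
```
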

4. ID spaces

ID(<ID space identifier>) 

Example:

movieId:ID(Movie-ID),title,year:int,:LABEL
1,"The Matrix",1999,Movie
2,"The Matrix Reloaded",2003,Movie;Sequel
3,"The Matrix Revolutions",2003,Movie;Sequel

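START_ID(<ID space identifier>)  END_ID(<ID space identifier>)

Example (the actor nodes get their own ID space, so a relationship file can refer to both spaces with :START_ID(Actor-ID) and :END_ID(Movie-ID)):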

 

personId:ID(Actor-ID),name,:LABEL
1,"Keanu Reeves",Actor
2,"Laurence Fishburne",Actor
3,"Carrie-Anne Moss",Actor
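
ID spaces exist because the same raw ID (here 1, 2, 3) can appear in both the movie file and the actor file. One way to picture this is to key every node by the pair (ID space, ID). The sketch below, in Python, extracts the ID space from a header field with a regular expression; `parse_id_field` and `node_key` are hypothetical names, and the pattern only covers the `ID`, `START_ID` and `END_ID` forms shown in this section.

```python
import re
from typing import Optional, Tuple

# Matches fields such as 'movieId:ID(Movie-ID)', ':START_ID(Actor-ID)' or 'personId:ID'.
_ID_FIELD = re.compile(
    r"^(?P<name>[^:]*):(?P<kind>ID|START_ID|END_ID)(?:\((?P<space>[^)]*)\))?$"
)


def parse_id_field(field: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (name, kind, id_space) for an ID-like header field, else None."""
    match = _ID_FIELD.match(field.strip())
    if match is None:
        return None
    return match.group("name"), match.group("kind"), match.group("space")


def node_key(id_space: Optional[str], raw_id: str) -> Tuple[Optional[str], str]:
    """Nodes are only unique within their ID space, so key them by both parts."""
    return (id_space, raw_id)


if __name__ == "__main__":
    print(parse_id_field("movieId:ID(Movie-ID)"))
    print(parse_id_field(":START_ID(Actor-ID)"))
    # Movie 1 and actor 1 no longer collide:
    print(node_key("Movie-ID", "1") != node_key("Actor-ID", "1"))
```
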

5. Skipping a column with :IGNORE

personId:ID,name,:IGNORE,:LABEL
keanu,"Keanu Reeves","male",Actor
laurence,"Laurence Fishburne","male",Actor
carrieanne,"Carrie-Anne Moss","female",Actor
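
The effect of :IGNORE can be pictured as dropping that column from every row before the data is loaded. A small sketch in Python, with the invented helper `drop_ignored_columns`; it mimics the described behaviour on an in-memory string and is not the importer's own code.

```python
import csv
import io
from typing import List


def drop_ignored_columns(csv_text: str) -> List[List[str]]:
    """Return all rows without the columns whose header ends in ':IGNORE'."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header = rows[0]
    keep = [i for i, field in enumerate(header) if not field.strip().endswith(":IGNORE")]
    return [[row[i] for i in keep] for row in rows]


if __name__ == "__main__":
    data = (
        "personId:ID,name,:IGNORE,:LABEL\n"
        'keanu,"Keanu Reeves","male",Actor\n'
        'carrieanne,"Carrie-Anne Moss","female",Actor\n'
    )
    for row in drop_ignored_columns(data):
        print(row)
```
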
  • Dump and load

neo4j-admin dump --database=<database> --to=<destination-path>

Example:

neo4j-admin dump --database=graph.db --to=/backups/graph.db/2016-10-02.dump

 

neo4j-admin load --from=<archive-path> --database=<database> [--force]

The database must be shut down before loading data:

neo4j stop
Stopping Neo4j.. stopped
neo4j-admin load --from=/backups/graph.db/2016-10-02.dump --database=graph.db --force
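
For scheduled backups, the dated archive path and the stop-then-load order can be captured in a couple of helper functions. This is a sketch only: `dump_command` and `load_commands` are hypothetical names, the directory layout simply copies the example path above, and the functions return argument lists without executing anything.

```python
import datetime
from typing import List


def dump_command(database: str, backup_root: str, day: datetime.date) -> List[str]:
    """Build a dump command that writes <backup_root>/<database>/<YYYY-MM-DD>.dump."""
    target = f"{backup_root}/{database}/{day.isoformat()}.dump"
    return ["neo4j-admin", "dump", f"--database={database}", f"--to={target}"]


def load_commands(database: str, archive: str, force: bool = False) -> List[List[str]]:
    """Return the commands for a restore, in order: stop Neo4j first, then load."""
    load = ["neo4j-admin", "load", f"--from={archive}", f"--database={database}"]
    if force:
        load.append("--force")
    return [["neo4j", "stop"], load]


if __name__ == "__main__":
    print(" ".join(dump_command("graph.db", "/backups", datetime.date(2016, 10, 2))))
    for cmd in load_commands("graph.db", "/backups/graph.db/2016-10-02.dump", force=True):
        print(" ".join(cmd))
```
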
  • Cypher Shell (run Cypher statements from the terminal)

cypher-shell [-h] [-a ADDRESS] [-u USERNAME] [-p PASSWORD] [--encryption {true,false}] [--format {verbose,plain}] [--debug] [--non-interactive] [-v] [--fail-fast | --fail-at-end] [cypher]

cypher-shell -u johndoe -p secret

Run a query script:

cat examples.cypher | bin/cypher-shell -u neo4j -p secret --format plain
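
The same pipe can be driven from a program by feeding the script file to cypher-shell on standard input. Below is a hedged sketch in Python using `subprocess`; the function names are invented, the executable path `bin/cypher-shell` is taken from the command above, and `run_script` assumes cypher-shell is installed and a server is reachable.

```python
import subprocess
from typing import List


def cypher_shell_argv(
    user: str,
    password: str,
    executable: str = "bin/cypher-shell",
    fmt: str = "plain",
) -> List[str]:
    """Build the argument list used in the piped example above."""
    return [executable, "-u", user, "-p", password, "--format", fmt]


def run_script(script_path: str, user: str, password: str) -> str:
    """Send a .cypher file to cypher-shell on stdin and return its output."""
    with open(script_path, "r", encoding="utf-8") as script:
        result = subprocess.run(
            cypher_shell_argv(user, password),
            stdin=script,          # equivalent of `cat file | cypher-shell ...`
            capture_output=True,
            text=True,
            check=True,            # raise if cypher-shell exits with an error
        )
    return result.stdout


if __name__ == "__main__":
    print(" ".join(cypher_shell_argv("neo4j", "secret")))
```
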

Using :param:

neo4j> :param thisAlias 'Robin'
neo4j> :params
thisAlias: Robin

 

neo4j> CREATE (:Person {name : 'Dick Grayson', alias : {thisAlias} });
Added 1 nodes, Set 2 properties, Added 1 labels
neo4j> MATCH (n) RETURN n;
n
(:Person {name: "Bruce Wayne", alias: "Batman"})
(:Person {name: "Selina Kyle", alias: ["Catwoman", "The Cat"]})
(:Person {name: "Dick Grayson", alias: "Robin"})

 

Transactions: :begin, :commit, :rollback

neo4j> MATCH (n) RETURN n;
n
(:Person {name: "Bruce Wayne", alias: "Batman"})
(:Person {name: "Selina Kyle", alias: ["Catwoman", "The Cat"]})
(:Person {name: "Dick Grayson", alias: "Robin"})
neo4j> :begin
neo4j# CREATE (:Person {name : 'Edward Mygma', alias : 'The Riddler' });
Added 1 nodes, Set 2 properties, Added 1 labels

In a second cypher-shell session, the node created inside the open transaction is not visible yet:

 

neo4j> MATCH (n) RETURN n;
n
(:Person {name: "Bruce Wayne", alias: "Batman"})
(:Person {name: "Selina Kyle", alias: ["Catwoman", "The Cat"]})
(:Person {name: "Dick Grayson", alias: "Robin"})

Back in the first session, commit the transaction:

 

neo4j# :commit
neo4j> MATCH (n) RETURN n;
n
(:Person {name: "Bruce Wayne", alias: "Batman"})
(:Person {name: "Selina Kyle", alias: ["Catwoman", "The Cat"]})
(:Person {name: "Dick Grayson", alias: "Robin"})
(:Person {name: "Edward Mygma", alias: "The Riddler"})
neo4j>

 

Show the current user with dbms.showCurrentUser():

neo4j> CALL dbms.showCurrentUser();
username, roles, flags
"johndoe", ["admin"], []
neo4j> :exit

Reference: https://neo4j.com/docs/operations-manual/current/tools/neo4j-admin/

 
