Create Graph Structures from Deeply Nested JSON Documents
Recently, we ran some tests with the current Structr version to see how well it handles deeply nested JSON documents when creating graph structures in Neo4j. It worked quite well up to the second level of object nesting, but only if new objects referenced existing objects either by their UUID or by a scalar value for which the corresponding mapping was defined in the schema.
When attributes of nested objects were referenced by more complex values, such as collections or nested objects themselves, things became difficult: you had to define a rather complex object mapping in the schema, using things like Notion attributes (which use `PropertySetNotion` internally).
What's new?
Over the past days, Christian added some improvements to mitigate these issues, making it much easier to create nodes and relationships in Neo4j based on the schema rules in Structr.
Example
To demonstrate the new capabilities, we've created the following example.
Schema
The schema rules in Cypher notation:
(:Project)-[:TASK]->(:Task)
(:Task)-[:SUBTASK]->(:Task)
(:Task)<-[:WORKS_ON]-(:Worker)
(:Worker)-[:WORKS_AT]->(:Company)
The cardinalities are:
- Project 1 -> * Task
- Task 1 -> * Task
- Worker 1 -> * Task
- Worker * -> 1 Company
To make the example work, it is important to override the auto-naming and name the attributes exactly as in the example: `tasks`, `parentTask`, `subtasks`, `company`, `worker`, `workers`, etc. Make sure the `name` attribute is unique for each type.
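Because the attribute names and cardinalities have to match the schema exactly, it can help to check a document before posting it. The following is a minimal sketch of such a check, not part of Structr: the `SCHEMA` dictionary and the `check` function are hypothetical and simply restate the rules and attribute names listed above.

```python
# Hypothetical, simplified view of the example schema:
# type -> {attribute: (target type, True if the attribute holds a collection)}
SCHEMA = {
    "Project": {"tasks": ("Task", True)},
    "Task": {
        "subtasks": ("Task", True),
        "parentTask": ("Task", False),
        "worker": ("Worker", False),
    },
    "Worker": {"tasks": ("Task", True), "company": ("Company", False)},
    "Company": {"workers": ("Worker", True)},
}


def check(node_type, data, path="$"):
    """Return a list of naming and cardinality problems in a nested document."""
    problems = []
    for attr, value in data.items():
        if attr == "name":
            continue
        here = f"{path}.{attr}"
        if attr not in SCHEMA[node_type]:
            problems.append(f"{here}: unknown attribute of {node_type}")
            continue
        target, is_collection = SCHEMA[node_type][attr]
        if isinstance(value, list) != is_collection:
            expected = "a list" if is_collection else "a single object"
            problems.append(f"{here}: expected {expected}")
            continue
        for child in (value if is_collection else [value]):
            # Scalars (e.g. a UUID reference) are not inspected further.
            if isinstance(child, dict):
                problems += check(target, child, here)
    return problems


doc = {
    "name": "Project1",
    "tasks": [
        {"name": "Task1", "workers": {"name": "Worker1"}},   # wrong attribute name
        {"name": "Task2", "worker": [{"name": "Worker2"}]},  # list instead of one object
    ],
}
for problem in check("Project", doc):
    print(problem)
```

For the sample document above, the check reports the misspelled `workers` attribute on the first task and the list-valued `worker` on the second.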
Auto-creation Rules
Instead of using a PropertyNotion
with autocreate
flag, you only have to define the following auto-creation rules (ALWAYS) in the Schema Editor:
- Project -> Task
- Task -> Task
- Worker -> Task
- Worker -> Company
JSON Document
The following example JSON document contains all information about the objects to be created. Note that some sub-objects occur multiple times in the document. If a unique attribute is defined for their type (e.g. the `name` attribute for Project, Company and Worker), the object is only created once.
{
"name": "Project1",
"tasks": [
{
"name": "Task1",
"worker": {
"name": "Worker1",
"company": {
"name": "Company1"
}
},
"subtasks": [
{
"name": "Subtask1.1",
"worker": {
"name": "Worker1",
"company": {
"name": "Company1"
}
}
},
{
"name": "Subtask1.2",
"worker": {
"name": "Worker2",
"company": {
"name": "Company1"
}
}
},
{
"name": "Subtask1.3",
"worker": {
"name": "Worker2",
"company": {
"name": "Company1"
}
}
},
{
"name": "Subtask1.4",
"worker": {
"name": "Worker3",
"company": {
"name": "Company2"
}
}
}
]
},
{
"name": "Task2",
"worker": {
"name": "Worker2",
"company": {
"name": "Company1"
}
}
},
{
"name": "Task3",
"worker": {
"name": "Worker3",
"company": {
"name": "Company2"
}
}
},
{
"name": "Task4",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
},
"subtasks": [
{
"name": "Subtask4.1",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
},
{
"name": "Subtask4.2",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
},
{
"name": "Subtask4.3",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
},
{
"name": "Subtask4.4",
"worker": {
"name": "Worker5",
"company": {
"name": "Company3"
}
}
}
]
},
{
"name": "Task5",
"worker": {
"name": "Worker5",
"company": {
"name": "Company3"
}
},
"subtasks": [
{
"name": "Subtask5.1",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
},
"subtasks": [
{
"name": "Subtask5.1.1",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
},
{
"name": "Subtask5.1.2",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
}
]
},
{
"name": "Subtask5.2",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
},
"subtasks": [
{
"name": "Subtask5.2.1",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
},
{
"name": "Subtask5.2.2",
"worker": {
"name": "Worker4",
"company": {
"name": "Company3"
}
}
}
]
}
]
}
]
}
Just save this document to a file and POST it to the `/projects` REST endpoint:
$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/projects" -XPOST -d @/tmp/project.json
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=rxjhtzyxetnj1l8dx6vg8aejq;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Location: http://0.0.0.0:8082/structr/rest/projects/fafeada746ef432196ee2ccfc7e362fc
Vary: Accept-Encoding, User-Agent
Content-Length: 121
Server: Jetty(9.1.4.v20140401)
{
"result_count": 1,
"result": [
"fafeada746ef432196ee2ccfc7e362fc"
],
"serialization_time": "0.000226953"
}
The complete graph was created, without creating any redundancy!
$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/projects/fafeada746ef432196ee2ccfc7e362fc/ui"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=el6ri37v61wzeuoni7ilgrl0;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Vary: Accept-Encoding, User-Agent
Content-Length: 1222
Server: Jetty(9.1.4.v20140401)
{
"query_time": "0.001580893",
"result_count": 1,
"result": {
"id": "fafeada746ef432196ee2ccfc7e362fc",
"name": "Project1",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Project",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:13+0100",
"lastModifiedDate": "2014-12-15T17:47:13+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"tasks": [
{
"id": "517c1a89e44f479eb0802b9045271b4c",
"name": "Task1"
},
{
"id": "dace6757bad94aa0a137420741406699",
"name": "Task2"
},
{
"id": "097729f47768469ebeaacd00ea8a442e",
"name": "Task3"
},
{
"id": "27bfdc1bb293458eab0d912811f610da",
"name": "Task4"
},
{
"id": "b762fd8f2fe24d149d4a220412c56f49",
"name": "Task5"
}
]
},
"serialization_time": "0.000166250"
}
$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/companies/ui"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=1mc9hmhm2umrm1h60h1vt56o7c;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Vary: Accept-Encoding, User-Agent
Content-Length: 2668
Server: Jetty(9.1.4.v20140401)
{
"query_time": "0.002427711",
"result_count": 3,
"result": [
{
"id": "bc9de3a97bce409da5234e5976355aa9",
"name": "Company1",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Company",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:12+0100",
"lastModifiedDate": "2014-12-15T17:47:12+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"workers": [
{
"id": "2da31b7175cb44c787ff93fe43c7e317",
"name": "Worker1"
},
{
"id": "d980f56aacea4a0bb2cb8cf7b3a740d5",
"name": "Worker2"
}
]
},
{
"id": "8fac823ef76f436c96cfc9e0c4c21fb5",
"name": "Company2",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Company",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:12+0100",
"lastModifiedDate": "2014-12-15T17:47:12+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"workers": [
{
"id": "a9a23f57b9ce4e33bcc1efbfd2537164",
"name": "Worker3"
}
]
},
{
"id": "eb34f6449d69484f93c69320fe95ea24",
"name": "Company3",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Company",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:13+0100",
"lastModifiedDate": "2014-12-15T17:47:13+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"workers": [
{
"id": "eadcb538f90a41838f0196fd74b82037",
"name": "Worker4"
},
{
"id": "8f8bd3613aa543ecae222da22cdd2e14",
"name": "Worker5"
}
]
}
],
"serialization_time": "0.000221961"
}
$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/workers/ui"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=1b6g61a7uo1t016croq1fz2x41;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Vary: Accept-Encoding, User-Agent
Content-Length: 6253
Server: Jetty(9.1.4.v20140401)
{
"query_time": "0.002056031",
"result_count": 5,
"result": [
{
"id": "2da31b7175cb44c787ff93fe43c7e317",
"name": "Worker1",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Worker",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:12+0100",
"lastModifiedDate": "2014-12-15T17:47:12+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"company": {
"id": "bc9de3a97bce409da5234e5976355aa9",
"name": "Company1"
},
"tasks": [
{
"id": "9988348105e34b1ab5d365f4e4f7262a",
"name": "Subtask1.1"
},
{
"id": "517c1a89e44f479eb0802b9045271b4c",
"name": "Task1"
}
]
},
{
"id": "d980f56aacea4a0bb2cb8cf7b3a740d5",
"name": "Worker2",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Worker",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:12+0100",
"lastModifiedDate": "2014-12-15T17:47:12+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"company": {
"id": "bc9de3a97bce409da5234e5976355aa9",
"name": "Company1"
},
"tasks": [
{
"id": "196189bde51a43afb587563fb47fda91",
"name": "Subtask1.2"
},
{
"id": "bd35947338924daa85b13391083551b1",
"name": "Subtask1.3"
},
{
"id": "dace6757bad94aa0a137420741406699",
"name": "Task2"
}
]
},
{
"id": "a9a23f57b9ce4e33bcc1efbfd2537164",
"name": "Worker3",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Worker",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:12+0100",
"lastModifiedDate": "2014-12-15T17:47:12+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"company": {
"id": "8fac823ef76f436c96cfc9e0c4c21fb5",
"name": "Company2"
},
"tasks": [
{
"id": "779b76705c9b43d598ce971024743b13",
"name": "Subtask1.4"
},
{
"id": "097729f47768469ebeaacd00ea8a442e",
"name": "Task3"
}
]
},
{
"id": "eadcb538f90a41838f0196fd74b82037",
"name": "Worker4",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Worker",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:13+0100",
"lastModifiedDate": "2014-12-15T17:47:13+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"company": {
"id": "eb34f6449d69484f93c69320fe95ea24",
"name": "Company3"
},
"tasks": [
{
"id": "8b2dbfcc93b94e428052ff1276991e34",
"name": "Subtask4.1"
},
{
"id": "f4b99c128cc24c518e6f9af5c3affea4",
"name": "Subtask4.2"
},
{
"id": "e6156b8b566e45349e671229ff70f9ea",
"name": "Subtask4.3"
},
{
"id": "27bfdc1bb293458eab0d912811f610da",
"name": "Task4"
},
{
"id": "5e45f0d22ee944ec84bc31aa75b40dda",
"name": "Subtask5.1.1"
},
{
"id": "2e21cad09d09423dacec07abcc763c3f",
"name": "Subtask5.1.2"
},
{
"id": "ad2e56c80ea944808b7082cdcab9f659",
"name": "Subtask5.1"
},
{
"id": "6173bc32066b40498147630d92c17990",
"name": "Subtask5.2.1"
},
{
"id": "83889c6a55f043c6bd69dc04614ff76f",
"name": "Subtask5.2.2"
},
{
"id": "e04e9f77fa444c40904b474f27bcdc61",
"name": "Subtask5.2"
}
]
},
{
"id": "8f8bd3613aa543ecae222da22cdd2e14",
"name": "Worker5",
"owner": {
"id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"name": "admin"
},
"type": "Worker",
"createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
"deleted": false,
"hidden": false,
"createdDate": "2014-12-15T17:47:13+0100",
"lastModifiedDate": "2014-12-15T17:47:13+0100",
"visibleToPublicUsers": false,
"visibleToAuthenticatedUsers": false,
"visibilityStartDate": null,
"visibilityEndDate": null,
"company": {
"id": "eb34f6449d69484f93c69320fe95ea24",
"name": "Company3"
},
"tasks": [
{
"id": "7785c5445d2147578bc8831797241e53",
"name": "Subtask4.4"
},
{
"id": "b762fd8f2fe24d149d4a220412c56f49",
"name": "Task5"
}
]
}
],
"serialization_time": "0.000484448"
}
Isn't that fascinating? :-)
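To double-check that nothing was duplicated, one can count the distinct names per type in the input document and compare them with the `result_count` values returned by the REST endpoints. The sketch below does this for a shortened excerpt of the document. The `TARGET_TYPE` mapping and the `distinct_names` helper are illustrative assumptions, not part of Structr.

```python
import json
from collections import defaultdict

# Assumed mapping from attribute name to the type of the nested object(s).
TARGET_TYPE = {
    "tasks": "Task",
    "subtasks": "Task",
    "worker": "Worker",
    "company": "Company",
}


def distinct_names(node_type, data, seen=None):
    """Collect the set of distinct names per type in a nested document."""
    seen = defaultdict(set) if seen is None else seen
    seen[node_type].add(data["name"])
    for attr, value in data.items():
        target = TARGET_TYPE.get(attr)
        if target is None:
            continue  # plain attribute such as "name"
        for child in (value if isinstance(value, list) else [value]):
            distinct_names(target, child, seen)
    return seen


# Shortened excerpt of the example document (first task, two subtasks).
excerpt = json.loads("""
{"name": "Project1", "tasks": [
  {"name": "Task1",
   "worker": {"name": "Worker1", "company": {"name": "Company1"}},
   "subtasks": [
     {"name": "Subtask1.1",
      "worker": {"name": "Worker1", "company": {"name": "Company1"}}},
     {"name": "Subtask1.2",
      "worker": {"name": "Worker2", "company": {"name": "Company1"}}}
   ]}
]}
""")

counts = {t: len(names) for t, names in distinct_names("Project", excerpt).items()}
print(counts)  # {'Project': 1, 'Task': 3, 'Worker': 2, 'Company': 1}
```

Run against the full document from `/tmp/project.json`, the same function should yield 5 distinct workers and 3 distinct companies, matching the `result_count` values in the responses above.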
How it works
Workflow
The parser (GSON) creates a nested structure of maps (JsonInput) from the JSON document, which is then recursively matched against the schema rules, starting from the innermost object. Structr uses a so-called "DeserializationStrategy" to find out whether a nested object already exists in the graph (and can therefore be linked directly), or whether it should be created according to the auto-creation rules in the schema.
Recursive evaluation
When parsing the above JSON document, Structr looks for a Project with the name `Project1` and an array of Tasks with the names `Task1` to `Task5`. To look up the first Task with the name `Task1`, Structr recursively calls the DeserializationStrategy to obtain the desired Task, and does this again for the Worker and the Worker's Company. Since the Company entity has no more nested elements, the recursion stops and Structr looks for a Company with the name `Company1` in the database. The company does not exist, so the auto-creation settings cause it to be created and returned to the previous recursion level, where it is linked to the newly created Worker with the name `Worker1`.
This process recursively creates new entities, or fetches them from the database if they already exist, mapping the nested JSON document to a graph structure according to the schema rules.
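The recursion described above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions and not Structr's actual DeserializationStrategy: the `SCHEMA` dictionary, the `Graph` class and the `deserialize` function are hypothetical, objects are identified solely by type and `name`, and a Python dictionary stands in for the database.

```python
# Hypothetical, simplified schema: type -> {attribute: target type}.
SCHEMA = {
    "Project": {"tasks": "Task"},
    "Task": {"subtasks": "Task", "worker": "Worker"},
    "Worker": {"company": "Company"},
    "Company": {},
}


class Graph:
    """Tiny in-memory stand-in for the database."""

    def __init__(self):
        self.nodes = {}     # (type, name) -> node properties
        self.links = set()  # (source key, attribute, target key)

    def find_or_create(self, node_type, name):
        # Reuse the node if it exists, otherwise create it
        # (this plays the role of an ALWAYS auto-creation rule).
        key = (node_type, name)
        self.nodes.setdefault(key, {"type": node_type, "name": name})
        return key


def deserialize(graph, node_type, data):
    """Resolve nested objects first, then the object itself, then link them."""
    resolved = []
    for attr, target_type in SCHEMA[node_type].items():
        if attr not in data:
            continue
        value = data[attr]
        for child in (value if isinstance(value, list) else [value]):
            resolved.append((attr, deserialize(graph, target_type, child)))
    key = graph.find_or_create(node_type, data["name"])
    for attr, child_key in resolved:
        graph.links.add((key, attr, child_key))
    return key


doc = {
    "name": "Project1",
    "tasks": [{
        "name": "Task1",
        "worker": {"name": "Worker1", "company": {"name": "Company1"}},
        "subtasks": [
            {"name": "Subtask1.1",
             "worker": {"name": "Worker1", "company": {"name": "Company1"}}},
            {"name": "Subtask1.2",
             "worker": {"name": "Worker2", "company": {"name": "Company1"}}},
        ],
    }],
}

graph = Graph()
deserialize(graph, "Project", doc)
print(len(graph.nodes), "nodes,", len(graph.links), "links")
```

Although `Worker1` and `Company1` appear several times in the input, each ends up as a single node, and the first node to be created is a Company because the recursion bottoms out there.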
Structr as a Document Database
The described feature will greatly enhance the document database capabilities of Structr and will be part of the upcoming 1.1 release.
You can find the test code for this particular example behind the following link: