sHINER: Entity Resolution with Graph Generating Dependencies

sHINER is a tool for validating and repairing Graph Generating Dependencies (GGDs), a new class of graph dependencies for property graphs inspired by the tuple- and equality-generating dependencies for relational data.

GGDs can be applied in different scenarios and use cases. In the context of the SmartDataLake project, they were applied to the task of entity resolution (ER). When a GGD is violated, new vertices and/or edges are generated, so ER matching rules or conditions can be rewritten as GGDs that generate links (edges) between entities referring to the same real-world entity. One of the main advantages of expressing matching rules as GGDs is that a rule can encode more than a vertex-to-vertex comparison (or a row-to-row comparison in relational databases), because it considers all the information in a defined graph pattern.
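The idea of a matching rule written as a GGD can be illustrated with a minimal sketch. It is not sHINER's API or its GGD syntax: the toy graph, the `worksAt` and `sameAs` labels, the `repair` function, and the 0.8 similarity threshold are all assumptions made for illustration. The source side of the rule is a graph pattern (two Person vertices working at the same Company) with a similarity constraint on their names; the target side is a `sameAs` edge, which is generated whenever the source matches and the target is missing.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Toy property graph (hypothetical data): vertices carry a label and
# properties, edges are (source, label, target) triples.
vertices = {
    "p1": {"label": "Person", "name": "Jon Smith"},
    "p2": {"label": "Person", "name": "John Smith"},
    "p3": {"label": "Person", "name": "Maria Lopez"},
    "c1": {"label": "Company", "name": "Acme"},
    "c2": {"label": "Company", "name": "Globex"},
}
edges = {
    ("p1", "worksAt", "c1"),
    ("p2", "worksAt", "c1"),
    ("p3", "worksAt", "c2"),
}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def employers(vertex_id):
    """Companies reachable from a vertex over a worksAt edge."""
    return {t for (s, label, t) in edges if s == vertex_id and label == "worksAt"}


def repair(threshold=0.8):
    """Apply one GGD-like rule and return the edges it generated.

    Source: two Person vertices that share an employer and whose names
    are at least `threshold` similar.
    Target: a sameAs edge between the two vertices.
    """
    people = sorted(v for v, p in vertices.items() if p["label"] == "Person")
    generated = []
    for a, b in combinations(people, 2):
        source_matches = bool(employers(a) & employers(b)) and (
            similarity(vertices[a]["name"], vertices[b]["name"]) >= threshold
        )
        target_holds = (a, "sameAs", b) in edges or (b, "sameAs", a) in edges
        if source_matches and not target_holds:
            # The rule is violated: repair the graph by generating the edge.
            new_edge = (a, "sameAs", b)
            edges.add(new_edge)
            generated.append(new_edge)
    return generated
```

Calling `repair()` once on this toy graph generates a single `sameAs` edge between `p1` and `p2`; a second call generates nothing, because the target side of the rule is then satisfied. Note that the rule uses the shared `worksAt` neighbourhood as evidence, not only the similarity of the two names.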

This component also includes functions for querying graph datasets using the G-Core query language, as well as functions that support the hierarchical graph visualization tab in the SDL-Vis component. All of the component's functionalities have been integrated for use with the SmartDataLake architecture.

For more information about the sHINER component and all of its functionalities see here.
For more information about the Graph Generating Dependencies and its syntax see our publication here.

Also, join our pitch presentation about our research on GGDs at the TU/e EAISI Summit 2021. The event will be streamed online and is free of charge.
