Graph databases are a rising trend in technology today. We hear discussions about why we should use them, how they differ, what value they bring, and just how cool the technology is. As developers of technology for Bible translation, we at Bridge Connectivity Solutions (BCS) have been asking the relevant questions: Would graphs suit our requirements? What data would go into them? What improvements could they bring, if any? To answer these, we made a small exploration into the world of current graph technologies, toyed around with a couple of databases, and tried to model the data we have as graphs. Here is the story of our little journey and what we found out.
Relational databases have been around for a long time and have greatly influenced our thinking. I tend to unknowingly model all the data I think of as tables. Their long history also means that their tooling is well established. Graph databases require us to move away from that, because they use a very different data structure. While relational databases claim to be about relations, graphs are actually all about relations. Data modelling in a graph database is about defining what becomes the nodes and how to connect them via relations. Once you are a little familiar with the idea of nodes and edges, it becomes a more natural and intuitive way to model data, and even a means of drawing things out on a whiteboard while you explain something complex. You will start to see that it is not just social-network data that has complex connections; a lot of the data we work with today is deeply connected and fit to be modeled as graphs.
To begin with, we tried out the Neo4j graph database, which is popular and has been used in the industry since 2007. We are also currently exploring the Dgraph database, which is more recent, has some promising features, and claims better performance.
We have an alignment project at BCS, part of the AutographaMT platform, in which we built a tool to align the words of the Bible in 12 Indian languages with the original languages, Greek and Hebrew. While working on that, we became familiar with the data structures used to represent the Bible, the various versions of the Bible, issues with versification mismatches between versions, etc. When we wanted to experiment with a graph database, we thought of using the same data. It includes:
- UGNT Bible for Greek,
- IRV Bible in Hindi,
- ULT Bible for English,
- The alignment we created between these (test/sample data, for which the ‘Quality Check’ stage has not been completed), and
- Marking up of Strong’s numbers and translation words in the UGNT Bible.
So we had some data to build a graph with; the next challenge was the actual data modeling. In trying to do that, we found ourselves pondering a bunch of questions: what becomes a node, how to connect nodes with relationships efficiently, how to choose the data types, how to create indices, and the like. The book Graph Databases, by Ian Robinson et al., was a help here. It has a good introduction to the various concepts of graph databases and to how to model data effectively for one.
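To make the "what becomes a node" question concrete, here is a small in-memory sketch in Python of the hierarchy that the demo queries further down traverse: a WORD belongs to a VERSE, which belongs to a CHAPTER, a BOOK and finally a BIBLE. The labels and the BELONGS_TO relationship are taken from those queries. The `Node` class, the `position` and `text` properties, and the sample words are assumptions made purely for illustration; this is not how Neo4j stores anything internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labelled node with properties and outgoing BELONGS_TO edges."""
    label: str
    props: dict
    belongs_to: list = field(default_factory=list)  # outgoing BELONGS_TO edges


def add_child(label, parent, **props):
    """Create a node and connect it to its parent with a BELONGS_TO edge."""
    node = Node(label, props)
    node.belongs_to.append(parent)
    return node


def path_to_root(node):
    """Follow BELONGS_TO edges upward, e.g. WORD -> VERSE -> ... -> BIBLE."""
    path = [node]
    while path[-1].belongs_to:
        path.append(path[-1].belongs_to[0])
    return path


# Hypothetical sample data; the name and numbers mirror the demo queries.
bible = Node("BIBLE", {"name": "Hindi_IRV4"})
book = add_child("BOOK", bible, number="40")
chapter = add_child("CHAPTER", book, number="5")
verse = add_child("VERSE", chapter, number="3")
words = [add_child("WORD", verse, position=i, text=t)
         for i, t in enumerate(["word-a", "word-b"])]

labels = [n.label for n in path_to_root(words[0])]
print(" -> ".join(labels))  # WORD -> VERSE -> CHAPTER -> BOOK -> BIBLE
```

Here every level is its own node, so a question such as "which Bible does this word come from?" becomes a walk along edges rather than a chain of table joins.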
Going through this process, one thing we realized for certain is that graph databases are going to add more value and power to the data processing we do. They are going to be an integral part of the Vachan Engine we are building at BCS, which will serve as a smart/intelligent data engine that powers various AI-aided Bible translation applications.
That said, I am excited to give you access to a demo site we have set up with the aligned Biblical data on Neo4j (chosen because it had better-looking visualizations). You can try it out here. When it asks you to connect to the DB, use the following credential
And make sure you are using an http connection, not an
Here are a few queries you can try out. We would love to see the cool queries you come up with; please share them as comments on this post!
MATCH (n:BIBLE) RETURN n
MATCH p=((bib:BIBLE)<-[:BELONGS_TO]-(b:BOOK)<-[:BELONGS_TO]-(c:CHAPTER)<-[:BELONGS_TO]-(v:VERSE)<-[:BELONGS_TO]-(w:WORD)) WHERE bib.name='Hindi_IRV4' and b.number='40' and c.number='5' and v.number='3' RETURN p
or, for something a bit more sophisticated:
MATCH p=((bib:BIBLE)<-[:BELONGS_TO]-(b:BOOK)<-[:BELONGS_TO]-(c:CHAPTER)<-[:BELONGS_TO]-(v:VERSE)<-[:BELONGS_TO]-(w:WORD)) WHERE b.number='40' and c.number='5' and v.number='3' RETURN p
And click your way into the depths of the Graph!
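The two verse queries above differ only in whether the Bible name is fixed, so they can be generated from one template. Below is a minimal Python sketch that builds the query text together with a parameter dictionary, using Cypher's `$name` parameter syntax instead of values pasted into the string. The function name `verse_query` and its arguments are hypothetical and not part of the demo site.

```python
HIERARCHY = (
    "MATCH p=((bib:BIBLE)<-[:BELONGS_TO]-(b:BOOK)<-[:BELONGS_TO]-(c:CHAPTER)"
    "<-[:BELONGS_TO]-(v:VERSE)<-[:BELONGS_TO]-(w:WORD))"
)


def verse_query(book, chapter, verse, bible=None):
    """Return (cypher, params) for the words of one verse.

    With bible=None the Bible filter is dropped, so the verse is matched
    in every Bible in the graph, as in the second query above.
    """
    conditions = ["b.number = $book", "c.number = $chapter", "v.number = $verse"]
    # The demo queries compare numbers as strings, so pass strings here too.
    params = {"book": str(book), "chapter": str(chapter), "verse": str(verse)}
    if bible is not None:
        conditions.insert(0, "bib.name = $bible")
        params["bible"] = bible
    cypher = f"{HIERARCHY} WHERE {' AND '.join(conditions)} RETURN p"
    return cypher, params


cypher, params = verse_query(40, 5, 3, bible="Hindi_IRV4")
print(cypher)
print(params)
```

With the official Neo4j Python driver, the resulting pair could be handed to something like `session.run(cypher, params)`; that step is left out here because it needs a running database and connection details.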
This video has more details about the work we did and a walkthrough of the database mentioned above.
Presentation on Graph DB