Scripture Burrito 🌯 0.1.0 Beta Announcement

The Scripture Burrito working group is pleased to announce the release of the Scripture Burrito 0.1.0 Beta version of the specification!

Status

This standard is a work in progress. Things may well change significantly before v1.0.0. At this point the proposal includes:

  • An explanation of the history and overall thinking behind the standard, particularly the metadata format
  • A section-by-section presentation of the metadata format
  • Notes on how the metadata format may be extended, both informally and formally
  • Examples of metadata documents in XML
  • Examples of these metadata documents represented in JSON
  • An XML schema (RelaxNG) for the metadata

Feedback

Feedback may be provided here, in this forum, or via the Scripture Burrito GitHub issues.

The committee invites comments on all aspects of this documentation, but has identified some specific issues about which decisions need to be taken:

XML vs JSON for metadata

The current proposal specifies an XML schema for metadata, as well as a canonical way to represent that XML as JSON. This approach is derived from the current Digital Bible Library system. XML documents are relatively easy for non-technical users to understand, and XML technology is mature and widely supported.

However, it would also be possible to use JSON as the canonical format for Scripture Burrito metadata. This would have benefits for some software stacks, particularly JavaScript, and may make processing easier, but at the cost of reduced readability (especially for embedded XHTML).

(The committee would like to avoid specifying the format in both XML and JSON, as this is an invitation for arcane edge case incompatibilities. The current proposal does not do this, as the only validation model is for XML, and JSON is simply an expression of that validated XML.)
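To make the "JSON as an expression of validated XML" idea concrete, here is a minimal sketch of one possible XML-to-JSON mapping. It is not the canonical mapping from the specification: the element names in SAMPLE_XML, the "@" prefix for attributes, the "#text" key and the function names are all assumptions made up for illustration. It also ignores mixed content, which is exactly where embedded XHTML gets awkward in JSON.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical metadata fragment; element and attribute names are
# illustrative only and are not taken from the Scripture Burrito schema.
SAMPLE_XML = """
<burrito version="0.1.0">
  <identification>
    <name>Example Translation</name>
    <abbreviation>EXT</abbreviation>
  </identification>
  <languages>
    <language code="eng"/>
    <language code="fra"/>
  </languages>
</burrito>
"""


def element_to_dict(elem):
    """Map one element to JSON-friendly data.

    Assumed convention: attributes become "@name" keys, child elements
    become keys named after their tag (a list when the tag repeats), and
    a text-only element collapses to a plain string.
    """
    node = {f"@{key}": value for key, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Second or later occurrence of the same tag: promote to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if text:
        if not node:
            return text  # text-only element
        node["#text"] = text
    return node


def xml_to_json(xml_text):
    """Parse the XML and serialise the mapped structure as JSON text."""
    root = ET.fromstring(xml_text.strip())
    return json.dumps({root.tag: element_to_dict(root)}, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(xml_to_json(SAMPLE_XML))
```

Because the JSON is derived mechanically from XML that has already been validated, there is only one validation model to maintain, which is the point the committee makes above.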

USFM and USX for Scripture Text

The current proposal is based on the Digital Bible Library, which chose USX because it can be validated rigorously. As a result of this choice, several large publishing workflows including YouVersion and API.Bible use USX.

Much of the Bible translation world uses USFM, which is familiar to Bible translators, but which requires bespoke parsing tools, and which can be ambiguous in some circumstances. Also, USX contains machine-readable reference information that cannot be represented in USFM at this time. Valid USFM can be round-tripped to USX. USX cannot be round-tripped to USFM without losing the machine-readable references. Invalid USFM may not have an equivalent representation in USX.
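The lossy direction can be illustrated with a toy converter. This is only a sketch under stated assumptions: the USX fragment is hand-written with made-up verse text, it assumes para, verse and ref elements with style, number and loc attributes, the function name is hypothetical, and the output is simplified USFM-like text rather than what a real converter would emit.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative USX-like fragment (assumed shape, made-up text).
USX_FRAGMENT = (
    '<para style="p"><verse number="1" style="v"/>As it is written '
    '(<ref loc="ISA 40:3">Isa 40:3</ref>), a voice cries out.</para>'
)


def para_to_usfm(para):
    """Flatten one paragraph element into simplified USFM-like text."""
    parts = [f"\\{para.get('style')} ", para.text or ""]
    for child in para:
        if child.tag == "verse":
            parts.append(f"\\v {child.get('number')} ")
        elif child.tag == "ref":
            # Only the display text is kept; the machine-readable
            # loc attribute has nowhere to go in this output.
            parts.append(child.text or "")
        parts.append(child.tail or "")
    return "".join(parts)


usfm = para_to_usfm(ET.fromstring(USX_FRAGMENT))

if __name__ == "__main__":
    print(usfm)
```

The human-readable "Isa 40:3" survives in the output, but the structured "ISA 40:3" target does not, so converting the result back to USX could not restore it without re-parsing the display text.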

Paratext currently uses both USFM and USX internally at various points.

The committee’s current proposal is:

  • USFM for translations in progress
  • USX for valid content, orientated towards publication (incremental or otherwise)

The committee would appreciate proposals for constructive and technically feasible alternatives.

Deadline

The window for feedback on Scripture Burrito 0.1.0 Beta extends until Friday 25th October 2019. Depending on the level of feedback, the committee hopes to produce v0.1.0 before the end of 2019. The development roadmap may be seen in the GitHub milestones here.

USX is a fact of life and Paratext is freely available as an easy-to-use interpreter.
I agree that JSON is useful, but it isn’t very human-readable.
Perhaps there could be an official tool for converting USX to JSON and back.

Hi @David_Instone-Brewer, note that the discussion point concerns the Scripture Burrito metadata itself, not the Scripture encoding. In other words, there are two distinct points of discussion:

  1. Should we use XML or JSON for the definition metadata?
  2. Should we continue to support both USFM and USX?

As odd as it may sound, we could decide to use JSON for the metadata and only USX for Scripture. Plenty of other SB flavors will have no XML in them so there isn’t really a contradiction here. In reality, it’s very likely that any tool dealing with Scripture Burritos will need XML tooling (either for data or metadata) and JSON tooling (again, either for metadata or data).

To be clear, I’m not trying to short circuit the discussion, just noting that the two points really are distinct.

To your last point, it’s possible that usfm-js or USFM Grammar could become that. As the names suggest, both of those deal with USFM rather than USX directly, though the JSON representation itself may not be different.

Thanks for the explanation.

Following recent discussions, the Scripture Burrito working group is planning to switch to JSON for our metadata specification in 0.2.0-beta. The plan is to use JSON Schema for defining the metadata specification. The XML schema will be fully deprecated in 0.2.0-beta.

This was one of the main points noted above and there has been a lot of discussion about it.

The primary reason for making this change is that JSON is JavaScript-friendly and many organizations are using JavaScript frameworks. In addition, most other languages have adequate libraries for either XML or JSON, so it seemed that optimizing for JavaScript was acceptable. Of course, this will make browser-based Scripture Burrito solutions easier to develop too.
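
For anyone unfamiliar with JSON Schema, here is a rough sketch of what schema-driven validation of a metadata document looks like. The schema below is invented for illustration and is not the Scripture Burrito schema; the property names and the validate function are assumptions. To stay dependency-free, the checker handles only the "type", "required" and "properties" keywords; a real tool would presumably use a complete JSON Schema validator library instead.

```python
import json

# Hypothetical schema: property names are illustrative, not from the spec.
METADATA_SCHEMA = {
    "type": "object",
    "required": ["name", "languages"],
    "properties": {
        "name": {"type": "string"},
        "languages": {"type": "array"},
    },
}

# Only the JSON types this sketch needs, mapped to Python types.
_TYPES = {"object": dict, "array": list, "string": str}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means no errors found.

    Supports a small subset of JSON Schema: "type", "required", "properties".
    """
    expected = schema.get("type")
    if expected and not isinstance(instance, _TYPES[expected]):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}.{key}: required property missing")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    return errors


if __name__ == "__main__":
    good = json.loads('{"name": "Example Translation", "languages": ["eng"]}')
    bad = json.loads('{"name": 5}')
    print(validate(good, METADATA_SCHEMA))
    print(validate(bad, METADATA_SCHEMA))
```

The same schema file could be consumed by validators in JavaScript or any other language, which fits the reasoning given above for the switch.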