Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upProposal: new version of dag-json #129
Conversation
|
|
||
| The internal data format is valid JSON but is **NOT** identical to the decoded | ||
| node data codecs produce. |
daviddias
Jun 11, 2019
Member
I do not understand why this is necessary. Is there a place with the rationale to make JSON not JSON?
I do not understand why this is necessary. Is there a place with the rationale to make JSON not JSON?
mikeal
Jun 11, 2019
Author
Member
Right now, we’re using a reserved key, ”/“, for encoding links and binary. Several people think that key reservation is highly problematic because it means there are certain data sets you just can encode.
This change would fix that changing the internal block format. The format would *still be valid JSON, it just wouldn’t have as close of a 1-to-1 match with the decoded form of the data.
However, this change may not be worth it. This PR is to show that we have a solution to the reserved key issue and that is what it would take to fix it.
Right now, we’re using a reserved key, ”/“, for encoding links and binary. Several people think that key reservation is highly problematic because it means there are certain data sets you just can encode.
This change would fix that changing the internal block format. The format would *still be valid JSON, it just wouldn’t have as close of a 1-to-1 match with the decoded form of the data.
However, this change may not be worth it. This PR is to show that we have a solution to the reserved key issue and that is what it would take to fix it.
| ```javascript | ||
| { "/": { "_": 1 }, | ||
| "data": { "hello": "world", { "obj": { "array": [0, 0] } } }, |
vmx
Jun 11, 2019
Member
So those [0, 0] are placeholders? So if you want to encode [CID, "some string"] it would be [0, "some string"]?
So those [0, 0] are placeholders? So if you want to encode [CID, "some string"] it would be [0, "some string"]?
mikeal
Jun 11, 2019
Author
Member
They are just placeholders to preserve the ordering of the array.
They are just placeholders to preserve the ordering of the array.
vmx
Jun 12, 2019
Member
I'd use null for that, but that's a minor detail. Perhaps the example could be expended to [null, "some string", null] to make it clearer that there are place holders which are mixed with actual values.
I'd use null for that, but that's a minor detail. Perhaps the example could be expended to [null, "some string", null] to make it clearer that there are place holders which are mixed with actual values.
mikeal
Jun 12, 2019
Author
Member
originally it was null but i went with 0 because null encoding is extra bytes :/
originally it was null but i went with 0 because null encoding is extra bytes :/
| ```javascript | ||
| { hello: 'world', | ||
| key: Buffer.from('value'), |
vmx
Jun 11, 2019
Member
I would really prefer if we'd move to an Browser first approach and not using Node.js primitives:
key: new TextEncoder().encode('value'),
I would really prefer if we'd move to an Browser first approach and not using Node.js primitives:
key: new TextEncoder().encode('value'),
mikeal
Jun 11, 2019
Author
Member
it’s just not entirely clear to everyone that this does binary encoding. perhaps we should just move to psuedo-code, something like Bytes(‘value’)
it’s just not entirely clear to everyone that this does binary encoding. perhaps we should just move to psuedo-code, something like Bytes(‘value’)
vmx
Jun 12, 2019
Member
That would work for me. I just don't want to give the impression it's a Node.js Buffer.
That would work for me. I just don't want to give the impression it's a Node.js Buffer.
|
Is the motivation just to support If we just want to support |
I’m a little confused by this use case, because the If you want to take existing JSON data and add links to it, you’re going to have to use an encoding interface anyway and all of these details are invisible. Is there a case I’m missing where you want
Yes, that’s the only thing we get out of this change. Using “/“ as a key will still be highly problematic for anyone using paths and selectors, but this would fix the issue at the codec level. I’m actually not too worried about “/“ being reserved in |
dag-json wasn't designed to be 100% compatible but still mostly compatible. I agree we should have a plain JSON format for users who just want JSON but the point of this format was to provide a familiar, JSON-like format that mostly "just works" (i.e., you can link up existing JSON objects). |
|
Another thing that might be worth keeping in mind. Having a format you can easily use as exchange format for tests: multiformats/rust-cid#17 (comment) |
|
@mikeal We decided to put this into design history, but after having a closer look I think the discussion can be easily be summed up. Instead of having it archived automatically I suggest that we create a proper exploration report like https://github.com/ipld/specs/blob/035683c97d0280de5e2d490822d63ad618a8acab/design/history/exploration-reports/2019.06-unixfsv2-spike-01.md. @mikeal if you think it's worth having it there, could you please create a new PR with that report? |
This spec change would resolve the outstanding issue of
dag-jsonnot being able to represent the “/“ key.Since “/“ is already reserved, we use that key for the version definition, which makes this new version backwards compatible with all of the existing
dag-jsondata.This change would also set up
dag-jsonto store data outside of the actual node data, which could be used similar to how tags are used in cbor or for something like ipld/js-composites#3