sql: support JSON/Protobuf #2969
Comments
This is a very useful feature for semi-structured data.
The problem is how we are going to select/mutate these columns. If they are treated as text, then what is the value? Couldn't clients just unmarshal that text into whatever structure they have? How do we filter based on the JSON document's content (select * from foo where JSONCOLUMN.Person.Id = 1?)? What about updating part of a document? How do we run aggregate queries on values?
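For context, these are roughly the answers Postgres-style JSONB gives to the questions above; the table and column names are made up, and this is not CockroachDB syntax (none existed at the time of this comment):

```sql
-- Filtering on document content:
SELECT * FROM foo WHERE jsoncolumn -> 'Person' ->> 'Id' = '1';

-- Updating part of a document in place:
UPDATE foo SET jsoncolumn = jsonb_set(jsoncolumn, '{Person,Name}', '"Alice"');

-- Aggregating over extracted values:
SELECT avg((jsoncolumn ->> 'score')::FLOAT) FROM foo;
```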
See also #3895 (which had a couple of thumbs up, much discussion, and in particular linked to https://github.com/mgutz/dat, the Go Postgres Data Access Toolkit).
All, Thanks!
@Linicks We haven't even sketched out how a JSON/Protobuf column type would work, so it is hard to say with any certainty what the limits would be. If I were to guess, though, there wouldn't be any specific limit on the number of JSON/Protobuf columns within a table, but overall row size limits (for which we don't have any hard setting at the moment) would still apply.
It's great to see that support for the JSON datatype is on the 1.0 roadmap. Would we be able to index columns of this datatype in a way that'll allow efficient querying of object paths? In Postgres, for example, columns of type jsonb can be indexed so that such queries are efficient.
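For reference, the Postgres feature being referred to looks roughly like this (table and index names are made up):

```sql
-- A GIN index on a jsonb column makes containment / path queries efficient:
CREATE TABLE docs (id SERIAL PRIMARY KEY, body jsonb);
CREATE INDEX docs_body_idx ON docs USING GIN (body);

-- This query can be served by the GIN index:
SELECT * FROM docs WHERE body @> '{"person": {"id": 1}}';
```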
@kulshekhar We haven't sketched out the design yet so it's difficult to answer your question with any certainty. I would guess the answer is yes, and we'll definitely look at the Postgres jsonb functionality.
@petermattis we've got a need for jsonb column types and functions in a project I'm working on. We would love to contribute code once a spec is agreed on.
+1 for Protocol Buffers support. JSON is of course very valid too, however we do have a specific need for Protobuf.

The challenge for JSON is choosing the correct data type for a given field, without any prior knowledge of the field, in order to create an index on that field. (I believe Postgres' jsonb has the same limitation.)

Of course, in order to create a Protocol Buffer field in CockroachDB that is both indexable and queryable, CockroachDB needs to be aware of the schema. You'd need to be able to assign a Protobuf schema to a field, and also be able to update that schema (ideally at runtime, so that no downtime is required to upgrade a Protocol Buffer field's schema). To my knowledge, CockroachDB would be the only database offering such a feature, but it would be hugely desirable to anyone looking to use Protobuf in a database.

It might be helpful to take a look at this, which I just found: https://github.com/google/mysql-protobuf/wiki Mind you, I think it's rather strange that a demonstration isn't shown where bytes are inserted into the protobuf field.
On JSON support: how about borrowing some ideas from MySQL? Overview:
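For readers unfamiliar with it, MySQL 5.7's approach (the one this comment is pointing at) looks roughly like the sketch below; the table and column names are illustrative only:

```sql
-- MySQL 5.7: JSON is a native column type; values are read with ->> / JSON_EXTRACT,
-- and indexing works indirectly through generated columns.
CREATE TABLE events (
  id        BIGINT PRIMARY KEY,
  doc       JSON,
  user_name VARCHAR(64) AS (doc ->> '$.user.name') STORED,
  INDEX     idx_user_name (user_name)
);

SELECT JSON_EXTRACT(doc, '$.user.level')
FROM   events
WHERE  user_name = 'alice';
```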
I would really like to see functionality more akin to Postgres's JSON(B) support. For example, the ability to treat JSON as first class, with queries, indexing, etc., is extremely valuable.
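A rough sketch of what "first class" means in the Postgres sense (the table and column names are made up):

```sql
-- Query and aggregate over fields inside the document:
SELECT payload ->> 'status' AS status, count(*)
FROM   orders
GROUP  BY 1;

-- Index an extracted path directly with an expression index:
CREATE INDEX orders_status_idx ON orders ((payload ->> 'status'));
```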
Thanks for your feedback. We are actively working on scoping out this feature, so please provide as much detail as you would like in terms of use case / requirements. cc @nstewart
We're avid users of JSONB on Postgres, using SQLAlchemy as the lib to connect to it. We're exploring CockroachDB. It's great, but JSONB support is the one big missing thing that does not allow us to migrate. The good things in PG:

The not so good:

For example, if you have an object such as:

In PG, if you want to select the

For operators such as contains or exists, in PG they are

Hope my input helps.
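For completeness, the Postgres operators this comment refers to behave as follows; the table and key names are illustrative:

```sql
-- -> and ->> navigate into the document (-> returns jsonb, ->> returns text):
SELECT data -> 'address' ->> 'city' FROM users;

-- @> is "contains", ? is "key exists"; both can use a GIN index:
SELECT * FROM users WHERE data @> '{"active": true}';
SELECT * FROM users WHERE data ? 'email';
```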
JSONPath is becoming the standard for navigating JSON documents (similar to XPath for XML). Postgres has commits toward this which should be released in v11. This is the best option for CockroachDB to implement too, so there's a unified access language.
+1 for JSONPath! For simpler queries it would still be useful to have the usual JSON syntax. I would love to have something like:
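Purely as a hypothetical illustration of the two styles being contrasted here (neither is existing CockroachDB syntax; the first uses the Postgres 12+ jsonpath function, the dotted form is invented for illustration):

```sql
-- JSONPath style:
SELECT * FROM foo WHERE jsonb_path_exists(doc, '$.person ? (@.id == 1)');

-- "Usual JSON" dotted style:
SELECT * FROM foo WHERE doc.person.id = 1;
```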
It may be a good idea to separate these features, as I would hope the implementations for JSON and Protobuf would look (perhaps dramatically) different.

From a Protobuf user's perspective, I think we'd like to see a way to essentially create and update tables by passing a Protobuf schema definition to a create/alter table query. Whether or not data is stored on the backend as a protobuf is probably irrelevant, so long as it's as compact as possible (we wouldn't be using protobuf if we weren't totally paranoid about bandwidth/storage). I wouldn't want to see overhead introduced because of a choice to use a format designed to reduce overhead :)

I say "dramatically different" because I suspect the way CDB would approach creating the table would be. Since JSON is all about being dynamic, you'd want to allow for flexible schemas in a JSON table/field, whereas with protocol buffers you may want the entire table to be definable as a protocol buffer, not just one field. Although the difference between a protocol buffer field vs. a protocol buffer table is probably irrelevant, as long as you can provide a schema update.

As for querying, I agree with the comments above that querying should be easy (I see no reason why JSONPath couldn't be used for PB) and fields need to be indexable, especially at arbitrary depths.
What is the need for Protobuf in a relational database that already has data types and table schemas? This seems more like wanting a translation from Protobuf to SQL, which can be done with an ORM, especially if you don't care about the backend storage itself.
@manigandham I mean, you can make the exact same argument about JSON. The idea is that given a schema defined externally to the DB (say, inside an application), it becomes very easy to delegate management of the database without having to provide SQL access or build an interfacing layer. If the data can be stored as a protocol buffer while maintaining the same feature set, then I say by all means, but I suspect the storage impact of using PB in that case vs. a custom solution is probably very minimal.
@manigandham I can also tell you first hand that trying to build a conversion layer between a flat schema (SQL) and schemas with depth is tedious, to say the least.
Postgres's JSONB support together with GIN indexing is wonderful for a lot of use cases involving a mix of structured and unstructured data. However, like others mentioned, the JSON query semantics are slightly complicated and not trivial to reason about. There is a PG extension called JSQuery that tries to address many of the query limitations. I would love it if CockroachDB could support the query/index capabilities of JSQuery, maybe using a JSONPath interface.
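For reference, a sketch of the Postgres setup being described, plus a JSQuery-style predicate (the jsquery extension must be installed; table and field names are made up):

```sql
-- Plain jsonb + GIN, as described above (jsonb_path_ops is the smaller, faster opclass):
CREATE INDEX events_doc_idx ON events USING GIN (doc jsonb_path_ops);
SELECT * FROM events WHERE doc @> '{"type": "click"}';

-- With the jsquery extension, richer predicates become expressible and indexable:
SELECT * FROM events WHERE doc @@ 'type = "click" AND user.age > 21'::jsquery;
```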
Team, I'd like to introduce myself. My name is Andy and I am a PM here at CRDB. I'm happy to share with you that we have heard you loud and clear and are working to support JSONB in 1.2. To that end, we would love to solicit your comments on our current RFC scope proposal found here. In particular, we would love to hear more about how you plan to use JSONB and whether we are adequately thinking about your use case. How do you want to use JSONB? Please feel free to share in the comments below or directly at the RFC.
@awoods187 great news! What's the schedule for 1.2?
@xarg 1.2 is planned for release in April 2018.
I look forward to this feature!
JSONB support will be in the 2.0 release in April, so I'm going to close this issue!
@justinj Do you have any links to documentation on JSONB support that you can share publicly here?
Hi @mofirouz, we don't have any documentation for JSONB yet; it will be ready by the time 2.0 releases in April. That said, it's very similar to Postgres' JSONB feature, and most of their docs should be applicable to its use in CockroachDB. You'll need to use a recent build of CockroachDB to have access to all the JSONB features, though.
Hi. It works :) Cockroach:

root@:26257/> SELECT * FROM bank.json;
root@:26257/> SELECT * FROM bank.json WHERE data ->> 'foo' > '0' ORDER BY data ->> 'foo' ASC LIMIT 1;
root@:26257/> SELECT id, data ->> 'foo' AS count FROM bank.json WHERE data ->> 'foo' > '0' ORDER BY data ->> 'foo' ASC LIMIT 100;
Support manipulation of JSON/Protobuf data. This might mean adding JSON/Protobuf column types. Or it might mean adding builtin functions to manipulate TEXT columns that are JSON/Protobuf encoded. Need to do a little exploration and design work first to figure out which is the right path to take.
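To make the two options above concrete, here is a rough sketch of how each might look at the SQL level; the syntax and function names are hypothetical, written before any design existed:

```sql
-- Option 1: a dedicated column type with operators on it
CREATE TABLE events (id INT PRIMARY KEY, doc JSONB);
SELECT doc ->> 'user' FROM events WHERE doc -> 'meta' ->> 'kind' = 'click';

-- Option 2: plain TEXT columns plus builtin functions that parse on the fly
-- (json_extract_path_text here stands in for whatever builtins would be added)
CREATE TABLE events (id INT PRIMARY KEY, doc TEXT);
SELECT json_extract_path_text(doc, 'user') FROM events;
```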