Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Question - JSON Support. For example QueryJSON PostreSQL support would be really great. #3895

Closed
joeblew99 opened this issue Jan 15, 2016 · 13 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Milestone

Comments

@joeblew99
Copy link

Wondering if support for JSON that matches PostreSQL is planned ?

This is a go library for PostreSQL, that supports all these advanced PostreSQL functions
https://github.com/mgutz/dat

@petermattis
Copy link
Collaborator

We've only had rudimentary thoughts about JSON support. Likely something we'll do, though there are no specific plans. What in particular about the PostgreSQL JSON support is attractive? AFAIK, the support is a set of additional functions, which isn't exactly seamless integration.

@petermattis
Copy link
Collaborator

Changefeeds or something like it is something that has come up before. It isn't on our near term roadmap, but we'd like to do something of that nature eventually.

@jhartman86
Copy link

+1 for some type of semi-structured data support down the road. @joeblew99 agreed RethinkDB is great (especially RQL and changefeeds), but its hard to say either Rethink or Mongo deliver horizontal scalability for nothing, especially Mongo w/ regards to administration and data integrity. RethinkDB is shaping up worlds better than Mongo, but it still has some (not insignificant) issues w/ sharding and scaling horizontally, although they're making great progress. That said, I don't think either of those will be able to match Cockroach in terms of stability/robustness as a distributed database, being built on the foundation of a KV store supporting fully ACID transactions.

All that said, if semi-structured data/JSON make it onto the relatively-near-term-roadmap post 1.0, it would be simply awesome.

@JackKrupansky
Copy link

Any examples of what you would expect to see for schema definition and
query for JSON in SQL?

How nested might your JSON be? Do you want that full nesting in a single
row, or multiple rows?

-- Jack Krupansky

On Tue, Feb 2, 2016 at 6:25 PM, Jon Hartman notifications@github.com
wrote:

+1 for some type of semi-structured data support down the road. @joeblew99
https://github.com/joeblew99 agreed RethinkDB is great (especially RQL
and changefeeds), but its hard to say either Rethink or Mongo deliver
horizontal scalability for nothing, especially Mongo w/ regards to
administration and data integrity. RethinkDB is shaping up worlds better
than Mongo, but it still has some (not insignificant) issues w/ sharding
and scaling horizontally, although they're making great progress. That
said, I don't think either of those will be able to match Cockroach in
terms of stability/robustness as a distributed database, being built on the
foundation of a KV store supporting fully ACID transactions.

All that said, if semi-structured data/JSON make it onto the
relatively-near-term-roadmap post 1.0, it would be simply awesome.


Reply to this email directly or view it on GitHub
#3895 (comment)
.

@jhartman86
Copy link

@JackKrupansky Gave this thread a +1 as in "pie in sky, it'd be great to have someday, but don't divert any mental effort from the great work being done towards beta". That said, I think "expect to see" is a hard one to answer as really the only precedent for JSON in tandem w/ a relational DB has been the handiwork of Postgres w/ the JSONB datatype. As a frame of reference, its all I've got (and like very much w/ Postgres). @petermattis is right I believe, in that the support comes through additional sql functions, although the QL is extended with non-standard query operators: ->, ->>, and @>, <@, ?, ?|, ?& via GIN indexes.

Regarding nesting, hard one to answer. It'd be easy to say "any standard JSON", so arbitrarily deep, but even if it were restricted to one level deep beyond the root it'd be immensely useful.

@JackKrupansky
Copy link

Just to be clear, I was not inquiring how any existing product implements
JSON support, but rather what people are trying to accomplish. For example,
are people trying to introduce JSON because they think it will be a better
way of organizing and accessing their data in the database itself, or
simply because they are trying to cope with the fact that somebody else has
already made a decision to use JSON and already created copious amounts of
data in JSON format without regard to how it might be indexed and queried
in a relational database context.

IOW, if your conceptual target is an SQL database you would tend to design
your JSON accordingly (relatively flat or arrays of flat), but if your
conceptual target is simply JSON itself then you would tend to focus on
more nested use of JSON that aren't going to fit so well in a relational
data model without a fair amount of restructuring.

That said, if anyone has actually been using an existing product that has
JSON support, some real examples would be good to share here. Again, I am
not asking what the existing features are, but for real-world examples of
how people have used those features - or have thought about using those
features. I can read doc too, so no need to try to do that for me. The
issue is decent examples, which most doc is very weak in.

-- Jack Krupansky

On Wed, Feb 3, 2016 at 4:16 AM, jow blew notifications@github.com wrote:

The way rethink does it is very simple.
There are two ways of querying a across json documents.
However, what is the most significant is the horizontal querying ability
bakes into their architecture. This works by having a proxy node between
you and the dB, which can do the query plan automatically to scale a fan in
/ fan out query pattern.
This is significant - zero changes to schema or code, and it all just
scales horizontally.

Now rethinkdb has acid scoped at document level only. Terrible if you need
transactional control.

Cockroachdb has a nice eventually consistent model.

The opportunity is obvious from here on in:
Simple documents that can be written to with eventual consistent
guarantees, and queried with horizontal scaling

On Wed, 3 Feb 2016, 02:08 Jon Hartman notifications@github.com wrote:

@JackKrupansky https://github.com/JackKrupansky Gave this thread a +1
as in "pie in sky, it'd be great to have someday, but don't divert any
mental effort from the great work being done towards beta". That said, I
think "expect to see" is a hard one to answer as really the only
precedent
for JSON in tandem w/ a relational DB has been the handiwork of Postgres
w/
the JSONB datatype. As a frame of reference, its all I've got (and like
very much w/ Postgres). @petermattis https://github.com/petermattis is
right I believe, in that the support comes through additional sql
functions, although the QL is extended with non-standard query operators
<
http://www.postgresql.org/docs/9.4/static/functions-json.html#FUNCTIONS-JSONB-OP-TABLE
:
->, ->>, and @>, <@, ?, ?|, ?& via GIN indexes.

Regarding nesting, hard one to answer. It'd be easy to say "any standard
JSON", so arbitrarily deep, but even if it were restricted to one level
deep beyond the root it'd be immensely useful.


Reply to this email directly or view it on GitHub
<
#3895 (comment)

.


Reply to this email directly or view it on GitHub
#3895 (comment)
.

@jhartman86
Copy link

@JackKrupansky Forgive the references to how others implement it, call it a misunderstanding in exactly what you were asking. For an answer to the use case: managing user-definable meta data without using EAV.

@JackKrupansky
Copy link

Thanks, that's helpful.

Personally, I'd like CockroachDB to have the equivalent of Cassandra's map
collection data type, which lets you have a single column containing
arbitrary key/value pairs. Storing those pairs as a map in a string of JSON
is a way to simulate that, but the keys and values cannot be directly
queryable.

-- Jack Krupansky

On Wed, Feb 3, 2016 at 11:05 AM, Jon Hartman notifications@github.com
wrote:

@JackKrupansky https://github.com/JackKrupansky Forgive the references
to how others implement it, call it a misunderstanding in exactly what you
were asking. For an answer to the use case: managing user-definable meta
data without using EAV.


Reply to this email directly or view it on GitHub
#3895 (comment)
.

@jhartman86
Copy link

@JackKrupansky Seconded. I don't want to latch onto JSON as the go-to "thing" necessarily. Naively, its what I'm familiar with as an analog to arbitrary KV pairs in a single column. I've never used PG's JSONB as a way to store anything beyond KV pairs anyways actually, which sounds like its aligned with how Cassandra's map collection data type is used.

@JackKrupansky
Copy link

And is all of that pure streaming - write-once, read occasionally, and
update-never?

-- Jack Krupansky

On Wed, Feb 3, 2016 at 12:48 PM, jow blew notifications@github.com wrote:

I might add that rethinkdb and json has worked well for this mostly
because its ease of horizontal scaling and changefeeds.

The only requirement is to do with operating environments. Yes it needs to
run in the data centre, but it also needs to run outside the data centre.
This is why golang works well for me - ISA chip architecture adnostic and
OS agnostic.


Reply to this email directly or view it on GitHub
#3895 (comment)
.

@petermattis petermattis added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Feb 12, 2016
@petermattis petermattis modified the milestone: 1.0 Feb 14, 2016
@petermattis petermattis changed the title Question - JSON Support. For example QueryJSON PostreSQL support would be really great. sql: Question - JSON Support. For example QueryJSON PostreSQL support would be really great. Mar 31, 2016
@tbg
Copy link
Member

tbg commented Jun 3, 2016

Closing in favor of #2969.

@alexander-manley
Copy link

Cassandra and RethinkDB have been mentioned, but also in Redis they have a implemented a concept of published messages that are characterized as channels, and they do not have knowledge of what (if any) subscribers there may be. Subscribers express interest in one or more channels, and only receive messages that are of interest, without knowledge of what (if any) publishers there are. The Redis Pub/Sub implementation supports pattern matching and clients may subscribe to glob-style patterns in order to receive all the messages sent to channel names matching a given pattern, unlike the rather messy PostgreSQL setup for JSON changefeeds.

@ghost
Copy link

ghost commented Oct 18, 2016

Nats streaming now exists and provides a pretty good solution for the uni DAG style architecture talking about above.

https://github.com/nats-io/nats-streaming-server

How you parse your database logs and publish out to your upstream systems is up to you.
Mongo / meteor ) redux etc etc are all ways to do it.

Anyway Nate streaming is really worth a look !!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

No branches or pull requests

6 participants