Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Npgsql: Use NpgsqlPoint .NET type for marshalling GEO_POINT types. Explore communicating and marshalling GeoJSON types. #782

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

amotl
Copy link
Member

@amotl amotl commented Dec 21, 2024

About

Explore CrateDB+Npgsql type support for geometry and geospatial types.

Status

  • NpgsqlPoint works well for GEO_POINT types. Thanks, @simonprickett.
  • Marshalling from/to .NET's GeoJSON types apparently needs to be done manually 1.

Trivia

Please note 762b488 includes some chore adjustments to remedy warnings, which are unrelated to the main topic of the patch.

Footnotes

  1. ... because GEO_SHAPE types are communicated as strings? I don't actually know the reason, but would like to investigate why it doesn't work fluently, optimally/optionally using the PostGIS/GeoJSON Type Plugin?

@amotl
Copy link
Member Author

amotl commented Dec 21, 2024

This patch offers a 70/30 test failure flakyness, where I don't know where it is coming from. It can be observed both on my machine, and on CI. In a worst case, it could be CrateDB itself, but we would need to investigate to find out. As with each of such occasion, there is a chance that a discovery of a subtle downstream flaw can help to improve CrateDB.

[xUnit.net 00:00:03.39]     demo.tests.DemoProgramTest.TestGeoJsonTypesExample [FAIL]
  Failed demo.tests.DemoProgramTest.TestGeoJsonTypesExample [173 ms]
  Error Message:
   Newtonsoft.Json.JsonReaderException : error parsing coordinates
---- Newtonsoft.Json.JsonReaderException : Unexpected character encountered while parsing value: [. Path 'coordinates', line 1, position 17.

   at demo.tests.DemoProgramTest.TestGeoJsonTypesExample() in /home/runner/work/cratedb-examples/cratedb-examples/by-language/csharp-npgsql/tests/DemoProgramTest.cs:line 222

https://github.com/crate/cratedb-examples/actions/runs/12444934016/job/34745706235?pr=782#step:7:105

@amotl amotl requested review from simonprickett and kneth December 21, 2024 12:14
@amotl amotl marked this pull request as ready for review December 21, 2024 12:14
Comment on lines +355 to +385
public async Task InsertGeoJsonTyped()
{
/***
* Verify Npgsql PostGIS/GeoJSON Type Plugin with CrateDB.
* https://www.npgsql.org/doc/types/geojson.html
*
* TODO: Does not work yet, because CrateDB communicates GEO_SHAPE as string?
* The error message is:
*
* System.NotSupportedException : The NpgsqlDbType 'Geometry' isn't present in your
* database. You may need to install an extension or upgrade to a newer version.
*/
Console.WriteLine("Running InsertGeo");

// Insert single data point.
await using (var cmd = new NpgsqlCommand("""
INSERT INTO testdrive.example (
"geoshape"
) VALUES (
@geoshape
);
""", conn))
{
var point = new Point(new Position(85.43, 66.23));
cmd.Parameters.AddWithValue("geoshape", NpgsqlDbType.Geometry, point);
cmd.ExecuteNonQuery();
}

// Flush data.
await RefreshTable();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be sweet to gain typed GeoJSON support, like the Npgsql PostGIS/GeoJSON Type Plugin might be providing it when talking to PostGIS. I don't know why it isn't working, the error message is:

System.NotSupportedException : The NpgsqlDbType 'Geometry' isn't present in your
database. You may need to install an extension or upgrade to a newer version.

Maybe it does not work, because CrateDB communicates GEO_SHAPE as exclusively as string when using the PostgrSQL wire protocol? Please advise if you see any options for improvements here.

/cc @seut, @surister, @kneth

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is PostGIS' Geometry equivalent to CrateDB's geo_shape? Would a simple alias in the server do the trick?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When talking about PostGIS, and looking at compatibility concerns, it is not just about types, but also, and mostly, about operations on them.

PostGIS unlocks GDAL, while CrateDB unlocks JTS. Those are technically different animals, while they are still living in the same habitat. In this spirit, I figure that a simple alias will probably not be applicable, even if it would also be my dearest wish.

This doesn't mean we should not explore this area closer, how we could provide downstream compatibility, or at least a reasonable feature parity, possibly by other means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a simple alias in the server do the trick?

Maybe @seut has more insights into that. I will be so happy to also learn more about those details, and if they have been parts of any discussions in the past already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading the PostGIS docs, the generic geometry type standalone looks like it could be mapped to our GEO_SHAPE type as it serves as a generic type for all concrete spatial types the same like our geo_shape does. But as far as I read, the PostGIS also allows to specify a concrete geometry subset on table definition, e.g. geometry(LINESTRING). This isn't possible at CrateDB, we cannot limit the allowed concrete shape of a geometry value.
So we maybe could alias the PostGIS geometry type without a concrete spatial type definition, but even so we'd need to do some (extensive) testing to ensure that this works as expected (in terms of SQL and PG compatibility), especially, the query/filter behaviour.
I suggest to open a feature request at CrateDB to implement the geometry data type with the alias of geo_shape as a possible solution.

Copy link
Member Author

@amotl amotl Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seut: Thanks for your swift elaborations, and Happy New Year.
@kneth: Can I humbly ask you to follow @seut's advise and carry your proposal forward into a corresponding feature request ticket at crate/crate, already reusing @seut's statements on this topic to provide initial guidance?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl @seut I have created crate/crate#17187

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. 👍

Comment on lines +403 to +421
var point = new Point(new Position(85.43, 66.23));
var poly = new Polygon([
new LineString([
new Position(longitude: 5.0, latitude: 5.0),
new Position(longitude: 5.0, latitude: 10.0),
new Position(longitude: 10.0, latitude: 10.0),
new Position(longitude: 10.0, latitude: 5.0),
new Position(longitude: 5.0, latitude: 5.0),
])
]);
// TODO: Can GEO_SHAPE types be directly marshalled to a .NET GeoJSON type?
// Currently, `InsertGeoJsonTyped` does not work yet.
cmd.Parameters.AddWithValue("geoshape", NpgsqlDbType.Json, JsonConvert.SerializeObject(point));
cmd.ExecuteNonQuery();

cmd.Parameters.Clear();

cmd.Parameters.AddWithValue("geoshape", NpgsqlDbType.Json, JsonConvert.SerializeObject(poly));
cmd.ExecuteNonQuery();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What works is to communicate GeoJSON data using the NpgsqlDbType.Json type, but it needs manual marshalling like JsonConvert.SerializeObject(point).

Contrary to that, as mentioned above, the Npgsql PostGIS/GeoJSON Type Plugin enables to communicate .NET's GeoJSON types natively.

@simonprickett: Please let me know if you find any way do that already, which I might not have discovered yet. Thanks!

Comment on lines +438 to +448
// Query back data.
await using (var cmd = new NpgsqlCommand("SELECT * FROM testdrive.example", conn))
await using (var reader = cmd.ExecuteReader())
{
reader.Read();
// TODO: Can GEO_SHAPE types be directly marshalled to a .NET GeoJSON type?
// Currently, `InsertGeoJsonTyped` does not work yet.
var obj = reader.GetFieldValue<JsonDocument>("geoshape");
var geoJsonObject = JsonConvert.DeserializeObject<Point>(obj.RootElement.ToString());
return (Point?) geoJsonObject;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dito: Manual procedures are currently needed when working with .NET's native GeoJSON types.

Here, the code uses reader.GetFieldValue<JsonDocument> for retrieval, and JsonConvert.DeserializeObject<Point>(...) for unmarshalling and type casting.

Comment on lines +224 to +227
// Validate the outcome.
var coords = new Point(new Position(85.43, 66.23)).Coordinates;
Assert.Equal(coords.Latitude, point?.Coordinates.Latitude);
Assert.Equal(coords.Longitude, point?.Coordinates.Longitude);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't directly compare GeoJSON object instances. Need to run the comparison on the individual coordinates inside. 🤷

@amotl
Copy link
Member Author

amotl commented Jan 6, 2025

Do you have any objections to merge this patch, @simonprickett and @kneth?


// Enable PostGIS/GeoJSON Type Plugin.
// https://www.npgsql.org/doc/types/geojson.html
// dataSourceBuilder.UseGeoJson();
Copy link
Member

@seut seut Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this sending GEOJson (so a map encoded in JSON) to CrateDB?
If so, this could work when explicitly casting the insert value to a map/object:

insert into geom (geo) values('{"coordinates": [8.308903076149363, 47.05038385401457], "type": "Point"}'::object);

or

insert into geom (geo) select '{"coordinates": [8.308903076149363, 47.05038385401457], "type": "Point"}'::object;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestions, here and below. I will investigate them and report back.


// GEO_SHAPE
// While `GEO_POINT` is transparently marshalled as `NpgsqlPoint`,
// `GEO_SHAPE` is communicated as scalar `string` type, using WKT or GeoJSON format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: CrateDB won't communicate this as string but as a JSON type, see also https://github.com/crate/crate/blob/master/server/src/main/java/io/crate/protocols/postgres/types/PGTypes.java#L66.

Maybe my previously commented workaround by casting the insert to an object will work?

Comment on lines +361 to +362
* TODO: Does not work yet, because CrateDB communicates GEO_SHAPE as string?
* The error message is:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment about a possible workaround.

Comment on lines +445 to +447
var obj = reader.GetFieldValue<JsonDocument>("geoshape");
var geoJsonObject = JsonConvert.DeserializeObject<Point>(obj.RootElement.ToString());
return (Point?) geoJsonObject;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading data is not directly related to how data is ingested. Shouldn't the marshalling work when enabling GeoJSON by dataSourceBuilder.UseGeoJson();? Afaik the output JSON should be a valid GeoJSON format, at least for simple structures like Points.

@amotl amotl self-assigned this Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants