rewrite geopackage conversion #161

Open
wants to merge 83 commits into
base: master
5b725c6
Set up migration files
margrietpalm Sep 10, 2024
dfd92c3
wip - migration db schema
margrietpalm Sep 10, 2024
b41b159
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Sep 10, 2024
4fccf28
Wip: write migration
margrietpalm Sep 11, 2024
88cbcd9
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Sep 12, 2024
fbe8a31
Removed unused ZoomCategories
margrietpalm Sep 12, 2024
8616809
Fix mistake in migration related to unexpected columns in the source …
margrietpalm Sep 12, 2024
11cfc33
WIP: fix things I think
margrietpalm Sep 23, 2024
d72fbee
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Sep 30, 2024
c6a47f3
Fix things so all tests pass
margrietpalm Sep 30, 2024
08ccc87
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Oct 1, 2024
53f78dd
Remove setting views on upgrading
margrietpalm Oct 1, 2024
5897b05
Use unique name for temp table
margrietpalm Oct 1, 2024
c37b897
Add tests for migration
margrietpalm Oct 1, 2024
115ceee
Use models.Material instead of local Material class
margrietpalm Oct 1, 2024
a805be3
Clean up and small fix
margrietpalm Oct 1, 2024
a1452c2
Ensure id is copied
margrietpalm Oct 2, 2024
330240e
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Oct 2, 2024
f438d80
Rename some columns
margrietpalm Oct 2, 2024
3be33b0
rename manhole_indicator to visualisation
margrietpalm Oct 2, 2024
c406984
Add todo
margrietpalm Oct 3, 2024
85a208d
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Oct 3, 2024
3128a59
Make cross_section_location.cross_section_width and cross_section_hei…
margrietpalm Oct 4, 2024
aef59cb
Extend CrossSectionShape to make it easier to check shape configuration
margrietpalm Oct 14, 2024
972a2d5
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Oct 14, 2024
3e8fa69
Fix migration numbering
margrietpalm Oct 14, 2024
99ba21a
bump version for dev release
margrietpalm Oct 17, 2024
dc26680
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Oct 21, 2024
7aae81f
Correct creating cross_section_table
margrietpalm Oct 22, 2024
6806269
Rename manhole_bottom_level to bottom_level
margrietpalm Oct 23, 2024
b2905b5
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Oct 30, 2024
07440a8
Change 227 in several names to 228
margrietpalm Oct 31, 2024
9639cea
Remove outdated TODO
margrietpalm Oct 31, 2024
0eadbbb
Merge branch 'margriet_89_schema_300_1D' of github.com:nens/threedi-s…
margrietpalm Oct 31, 2024
33841c9
update changes
margrietpalm Nov 4, 2024
9f57893
Merge branch 'master' into margriet_89_schema_300_1D
margrietpalm Nov 4, 2024
ac89605
Bump dev version
margrietpalm Nov 4, 2024
2c93d20
Fix changes typo
margrietpalm Nov 4, 2024
d7d16ce
Make ModelSettings.node_open_water_detection an Enum of type NodeOpen…
margrietpalm Nov 5, 2024
ce1709b
Remove nullable constraint from some columns
margrietpalm Nov 11, 2024
9603133
Correct names in StructureControlTypes
margrietpalm Nov 11, 2024
f97e95b
bump version
margrietpalm Nov 11, 2024
df0982e
Migrate material_id values 9 and 10
margrietpalm Nov 11, 2024
7681d49
Prevent conflicting table names from breaking migration
margrietpalm Nov 11, 2024
734d902
Prevent empty strings from being copied as text
margrietpalm Nov 11, 2024
9b782c4
Delete temp table and v2_manhole after migration
margrietpalm Nov 15, 2024
31de3ae
Add get_legacy_value to StructureControlTypes to retrieve the name us…
margrietpalm Nov 19, 2024
365e274
bump version
margrietpalm Nov 19, 2024
f1adfd6
Fix number in changes
margrietpalm Nov 20, 2024
e929462
Add empty migration
margrietpalm Nov 21, 2024
0f37a7a
Add header to changes
margrietpalm Nov 21, 2024
5af51d7
Merge branch 'master' into margriet_schema_300_leftovers
margrietpalm Nov 25, 2024
77f1cac
Remove leftover indices (#137)
margrietpalm Nov 26, 2024
287b6db
Remove left over references in geometry columns (#139)
margrietpalm Nov 26, 2024
aae9bfc
Change name of table tags to tag (#140)
margrietpalm Nov 26, 2024
d4b8bdb
Merge branch 'master' into margriet_schema_300_leftovers
margrietpalm Nov 26, 2024
e19b358
Make model_settings.use_2d_rain and model_settings.friction_averaging…
margrietpalm Nov 28, 2024
bad4638
Fix use tables (#145)
margrietpalm Dec 2, 2024
2d6eea9
Merge branch 'master' into margriet_schema_300_leftovers
margrietpalm Dec 4, 2024
2cf5c45
Merge branch 'master' into margriet_schema_300_leftovers
margrietpalm Dec 4, 2024
e38c1e9
Improve migration performance for 223 (#148)
margrietpalm Dec 4, 2024
5745c1d
Remove foreign key requirements that were missed before
margrietpalm Dec 5, 2024
3250ee8
switch surface dwf map geom direction (#151)
margrietpalm Dec 10, 2024
8cc717a
Sanitize comma separated fields (#152)
margrietpalm Dec 10, 2024
80fa723
Fix sanitize comma separated fields (#154)
margrietpalm Dec 10, 2024
bc199c9
Merge branch 'master' into margriet_schema_300_leftovers
margrietpalm Dec 10, 2024
36615fb
Remove view (#156)
margrietpalm Dec 17, 2024
81731fd
Remove usage of ORM from migrations 228 and 229 (#158)
margrietpalm Dec 17, 2024
f926ee8
Fix typo
margrietpalm Dec 17, 2024
88471a9
Bump versions
margrietpalm Dec 18, 2024
0972c05
use gdal.VectorTranslate instead of command-line ogr2ogr
elisalle Jan 2, 2025
e3853a5
remove ogr2ogr check
elisalle Jan 2, 2025
4713462
handle errors and warnings
elisalle Jan 2, 2025
737cc03
clean up layer conversion code, fix error handling
elisalle Jan 2, 2025
61cbc66
update comment
elisalle Jan 2, 2025
0f00e4c
update docstring
elisalle Jan 2, 2025
059310e
do not use newlie in f-string
elisalle Jan 2, 2025
d4d17c0
sort imports
elisalle Jan 2, 2025
4b75f22
run black
elisalle Jan 2, 2025
a7f0d07
remove unused imports
elisalle Jan 2, 2025
4534aaa
install gdal python bindings in test workflow
elisalle Jan 2, 2025
3ab5aa2
fix geopackage test
elisalle Jan 9, 2025
f5c8e1f
Merge branch 'master' into eli-rewrite-geopackage-conversion
elisalle Jan 9, 2025
10 changes: 3 additions & 7 deletions .github/workflows/test.yml
@@ -44,23 +44,19 @@ jobs:
with:
python-version: ${{ matrix.python }}

- name: Install sqlite3 and spatialite
- name: Install GDAL, sqlite3 and spatialite
run: |
sudo apt update
sudo apt install --yes --no-install-recommends sqlite3 libsqlite3-mod-spatialite
sudo apt install --yes --no-install-recommends sqlite3 libsqlite3-mod-spatialite libgdal-dev gdal-bin
sqlite3 --version

- name: Install gdal (for ogr2ogr)
run: |
sudo apt update
sudo apt install --yes --no-install-recommends gdal-bin
ogr2ogr --version

- name: Install python dependencies
shell: bash
run: |
pip install --disable-pip-version-check --upgrade pip setuptools wheel
pip install ${{ matrix.pins }} .[test,cli]
pip install GDAL==$(gdal-config --version)
pip list

- name: Run tests
9 changes: 8 additions & 1 deletion CHANGES.rst
@@ -6,7 +6,14 @@ Changelog of threedi-schema
0.229.1 (unreleased)
--------------------

- Nothing changed yet.
- Rename sqlite table "tags" to "tag"
- Remove indices referring to removed tables in previous migrations
- Make model_settings.use_2d_rain and model_settings.friction_averaging booleans
- Remove columns referencing v2 in geometry_column
- Ensure correct use_* values when matching tables have no data
- Use custom types for comma separated and table text fields to strip extra white space
- Correct direction of dwf and surface map
- Remove v2 related views from sqlite


0.229.0 (2025-01-08)
150 changes: 80 additions & 70 deletions threedi_schema/application/schema.py
@@ -1,5 +1,3 @@
import re
import subprocess
import warnings
from pathlib import Path

@@ -10,6 +8,7 @@
from alembic.environment import EnvironmentContext
from alembic.migration import MigrationContext
from alembic.script import ScriptDirectory
from osgeo import gdal
from sqlalchemy import Column, Integer, MetaData, Table, text
from sqlalchemy.exc import IntegrityError

@@ -49,6 +48,13 @@ def _upgrade_database(db, revision="head", unsafe=True, progress_func=None):
alembic_command.upgrade(config, revision)


class GdalErrorHandler:
def __call__(self, err_level, err_no, err_msg):
self.err_level = err_level
self.err_no = err_no
self.err_msg = err_msg


class ModelSchema:
def __init__(self, threedi_db, declared_models=models.DECLARED_MODELS):
self.db = threedi_db
@@ -205,42 +211,22 @@ def upgrade_spatialite_version(self):

def convert_to_geopackage(self):
"""
Convert spatialite to geopackage using gdal's ogr2ogr.
Convert spatialite to geopackage using gdal.VectorTranslate.

Does nothing if the current database is already a geopackage.

Raises UpgradeFailedError if the conversion of spatialite to geopackage with ogr2ogr fails.
Raises UpgradeFailedError if the conversion of spatialite to geopackage with VectorTranslate fails.
"""

handler = GdalErrorHandler()
gdal.PushErrorHandler(handler)
gdal.UseExceptions()

warnings = []

if self.db.get_engine().dialect.name == "geopackage":
return
# Check if ogr2ogr
result = subprocess.run(
"ogr2ogr --version",
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
## ogr2ogr is installed; make sure the version is high enough and return if not
if result.returncode == 0:
# get version
version = re.findall(r"\b(\d+\.\d+\.\d+)\b", result.stdout)[0]
# trim patch version and convert to float
float_version = float(version[0 : version.rfind(".")])
if float_version < 3.4:
warnings.warn(
f"ogr2ogr 3.4 (part of GDAL) or newer is needed to convert spatialite to geopackage "
f"but ogr2ogr {version} was found. {self.db.path} will not be converted"
f"to geopackage."
)
return
# ogr2ogr is not (properly) installed; return
elif result.returncode != 0:
warnings.warn(
f"ogr2ogr (part of GDAL) is needed to convert spatialite to geopackage but no working"
f"working installation was found:\n{result.stderr}"
)
return

# Ensure database is upgraded and views are recreated
self.upgrade()
self.validate_schema()
@@ -250,46 +236,70 @@ def convert_to_geopackage(self):
with work_db.get_session() as session:
session.execute(text("DROP TABLE IF EXISTS spatialite_history;"))
session.execute(text("DROP TABLE IF EXISTS views_geometry_columns;"))
cmd = [
"ogr2ogr",
"-skipfailures",
"-f",
"gpkg",
str(Path(self.db.path).with_suffix(".gpkg")),
str(work_db.path),
"-oo",
"LIST_ALL_TABLES=YES",
]
try:
p = subprocess.Popen(
cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=-1

all_tablenames = [model.__tablename__ for model in self.declared_models]
geometry_tablenames = (
session.execute(text("SELECT f_table_name FROM geometry_columns;"))
.scalars()
.all()
)
non_geometry_tablenames = [
name for name in all_tablenames if name not in geometry_tablenames
]

if (
session.execute(
text(
"SELECT count(*) FROM sqlite_master WHERE name='schema_version';"
)
).scalar()
> 0
):
non_geometry_tablenames.append("schema_version")

infile = str(work_db.path)
outfile = str(Path(self.db.path).with_suffix(".gpkg"))

conversion_list = []
conversion_list.append(
gdal.VectorTranslateOptions(
format="gpkg",
skipFailures=True,
)
except Exception as e:
raise UpgradeFailedError(f"ogr2ogr failed conversion:\n{e}")
_, out = p.communicate()
# Error handling
# convert bytes to utf and split lines
out_list = out.decode("utf-8").split("\n")
# collect only errors and remove 'ERROR #:'
errors = [
[idx, ": ".join(item.split(": ")[1:])]
for idx, item in enumerate(out_list)
if item.lower().startswith("error")
]
# While creating the geopackage with ogr2ogr an error occurs
# because ogr2ogr tries to create a table `sqlite_sequence`, which
# is reserved for internal use. The resulting database seems fine,
# so this specific error is ignored
# convert error output to list
expected_error = 'sqlite3_exec(CREATE TABLE "sqlite_sequence" ( "rowid" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "name" TEXT, "seq" TEXT)) failed: object name reserved for internal use: sqlite_sequence'
unexpected_error_indices = [
idx for idx, error in errors if error.lower() != expected_error.lower()
]
if len(unexpected_error_indices) > 0:
error_str = "\n".join(
[out_list[idx].decode("utf-8") for idx in unexpected_error_indices]
)
raise UpgradeFailedError(f"ogr2ogr didn't finish as expected:\n{error_str}")
for table in non_geometry_tablenames:
conversion_list.append(
gdal.VectorTranslateOptions(
format="gpkg",
accessMode="update",
SQLStatement=f"SELECT * FROM {table}",
layerName=table,
)
)
for conversion_options in conversion_list:
try:
ds = gdal.VectorTranslate(
destNameOrDestDS=outfile,
srcDS=infile,
options=conversion_options,
)
# dereference dataset before writing additional layers to ensure the data is written
del ds
except RuntimeError as err:
raise UpgradeFailedError from err
else:
if (
hasattr(handler, "err_level")
and handler.err_level >= gdal.CE_Warning
):
warnings.append(handler.err_msg)

if len(warnings) > 0:
warning_string = "\n".join(warnings)
raise UpgradeFailedError(
"GeoPackage conversion didn't finish as expected:\n", warning_string
)

# Correct path of current database
self.db.path = Path(self.db.path).with_suffix(".gpkg")
# Reset engine so new path is used on the next call of get_engine()
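
The `GdalErrorHandler` added in this diff keeps only the most recent message GDAL reports. As a point of comparison, here is a minimal sketch of a handler that accumulates every message at or above warning severity. It is not part of this PR: the class name `CollectingGdalErrorHandler`, its `reset` method, and the constant `CE_WARNING` are hypothetical. `CE_WARNING` is assumed to mirror the value of `gdal.CE_Warning` (2) so that the sketch runs without GDAL installed.

```python
# Hypothetical sketch, not code from this PR.
CE_WARNING = 2  # assumption: same value as gdal.CE_Warning


class CollectingGdalErrorHandler:
    """Callable taking the (err_level, err_no, err_msg) arguments GDAL passes to a handler."""

    def __init__(self, min_level=CE_WARNING):
        self.min_level = min_level
        self.messages = []

    def __call__(self, err_level, err_no, err_msg):
        # Keep every message at or above the configured severity,
        # instead of overwriting the previous one.
        if err_level >= self.min_level:
            self.messages.append(err_msg)

    def reset(self):
        # Clear collected messages, e.g. between two conversion steps.
        self.messages.clear()


# Simulate three callbacks: a debug message, a warning and a failure.
handler = CollectingGdalErrorHandler()
handler(1, 0, "debug message")
handler(2, 1, "first warning")
handler(3, 1, "a failure")
collected = list(handler.messages)
```

With GDAL available, an instance could presumably be registered with `gdal.PushErrorHandler(handler)` in the same way as the handler in this diff, and unregistered afterwards with `gdal.PopErrorHandler()`.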
65 changes: 25 additions & 40 deletions threedi_schema/tests/test_gpkg.py
@@ -1,47 +1,32 @@
import re
import subprocess

import pytest

from sqlalchemy import text

from threedi_schema.application.schema import get_schema_version
from threedi_schema.infrastructure.spatialite_versions import get_spatialite_version



def test_convert_to_geopackage(oldest_sqlite):
if get_schema_version() < 300:
pytest.skip("Gpkg not supported for schema < 300")
@pytest.mark.parametrize("upgrade_spatialite", [True, False])
def test_convert_to_geopackage(oldest_sqlite, upgrade_spatialite):
# if get_schema_version() < 300:
# pytest.skip("Gpkg not supported for schema < 300")
# In case the fixture changes and refers to a geopackage,
# convert_to_geopackage will be ignored because the db is already a geopackage
assert oldest_sqlite.get_engine().dialect.name == "sqlite"
# check if ogr2ogr is installed and has the right version:
result = subprocess.run(
"ogr2ogr --version",
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
expect_success = False
if result.returncode == 0:
# get version
version = re.findall(r"\b(\d+\.\d+\.\d+)\b", result.stdout)[0]
# trim patch version and convert to float
float_version = float(version[0 : version.rfind(".")])
if float_version >= 3.4:
expect_success = True
if expect_success:
oldest_sqlite.schema.convert_to_geopackage()
# Ensure that after the conversion the geopackage is used
assert oldest_sqlite.path.suffix == ".gpkg"
assert oldest_sqlite.get_engine().dialect.name == "geopackage"
assert oldest_sqlite.schema.validate_schema()
else:
# Upgrade is not performed in convert_to_geopackage when ogr2ogr doesn't run
# because no operation should be performed in that case.
# However, this wil result in an invalid schema, so here we will run upgrade manually.
# Because convert_to_geopackage() is only used via upgrade; this is not an issue.
with pytest.warns():
oldest_sqlite.schema.convert_to_geopackage()
oldest_sqlite.schema.upgrade()
assert oldest_sqlite.path.suffix == ".sqlite"
assert oldest_sqlite.get_engine().dialect.name == "sqlite"
assert oldest_sqlite.schema.validate_schema()

if upgrade_spatialite:
_, file_version = get_spatialite_version(oldest_sqlite)
assert file_version == 3
oldest_sqlite.schema.upgrade(revision="0229")
oldest_sqlite.schema.upgrade_spatialite_version()
_, file_version = get_spatialite_version(oldest_sqlite)
assert file_version >= 4

oldest_sqlite.schema.convert_to_geopackage()
# Ensure that after the conversion the geopackage is used
assert oldest_sqlite.path.suffix == ".gpkg"
with oldest_sqlite.session_scope() as session:
gpkg_table_exists = bool(session.execute(text("SELECT count(*) FROM sqlite_master WHERE type='table' AND name='gpkg_contents';")).scalar())

assert gpkg_table_exists
assert oldest_sqlite.schema.validate_schema()
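
Both the conversion code and this test decide whether a table is present by counting rows in `sqlite_master`. The same check can be sketched with only the standard-library `sqlite3` module. The helper name `table_exists` is hypothetical, and the one-column `gpkg_contents` table created here is a stand-in for illustration, not the real GeoPackage definition.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table called `name` is listed in sqlite_master."""
    row = conn.execute(
        "SELECT count(*) FROM sqlite_master WHERE type='table' AND name=?;",
        (name,),
    ).fetchone()
    return row[0] > 0


# In-memory database with a stand-in table (hypothetical, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY);")
has_contents = table_exists(conn, "gpkg_contents")
has_schema_version = table_exists(conn, "schema_version")
conn.close()
```

Binding the table name as a parameter avoids building the SQL text by string formatting.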