sendStringParametersAsUnicode not considered for SQLServerBulkCopy? #1181

pbcornelius · 2019-11-08T16:23:09Z

Driver version

7.4.1

Problem description

Expected behaviour:
SQLServerBulkCopy should treat VARCHAR and LONGVARCHAR like LONGNVARCHAR and NVARCHAR if sendStringParametersAsUnicode = True; or, alternatively, provide a seperate configuration for this (if there is one, I apologise, but I could not find it here or here).
Actual behaviour:
It does not. As a result, unicode characters in VARCHAR columns are lost during transfer. This means one cannot transfer unicode from databases that do not distinguish between VARCHAR and NVARCHAR (such as H2).
Error message/stack trace:
None
Any other details that can be helpful:
A workaround is to wrap the ResultSetMetaData and manually swap VARCHAR for NVARCHAR, e.g.:

public class UnicodeResultSetMetaData implements ResultSetMetaData {

...

private static List<Integer> UNICODE_EQUIVALENT = List.of(Types.VARCHAR, Types.LONGVARCHAR);

private static List<String> UNICODE_EQUIVALENT_NAMES = List.of("VARCHAR",
		"CHARACTER VARYING",
		"LONGVARCHAR",
		"VARCHAR2",
		"VARCHAR_CASESENSITIVE");

public int getColumnType(int column) throws SQLException {
	int type = meta.getColumnType(column);
	if (UNICODE_EQUIVALENT.contains(type)) {
		return Types.NVARCHAR;
	} else {
		return type;
	}
}

public String getColumnTypeName(int column) throws SQLException {
	String type = meta.getColumnTypeName(column);
	if (UNICODE_EQUIVALENT_NAMES.contains(type)) {
		return "NVARCHAR";
	} else {
		return type;
	}
}

The text was updated successfully, but these errors were encountered:

peterbae · 2019-11-08T17:42:27Z

@pbcornelius thanks for letting us know, the team will look into this.

ulvii · 2019-11-15T01:35:04Z

Hi @pbcornelius ,
There are multiple ways of performing bulk copy operation. For me to get a better idea about your scenario, please provide standalone JDBC reproduction code and I will take a look into it. It is fine if your code involves H2 connection.

pbcornelius · 2019-11-15T07:32:09Z

Thanks for looking into this. Here's my code. It should reproduce that error with any H2 -> SQL Server 2017 non-UTF CP, VARCHAR to NVARCHAR transfer.

try (Connection con1 = DriverManager.getConnection("h2 url", user1, pw1);
		Connection con2 = DriverManager.getConnection("mssql url", user2, pw2);
		Statement stmt1 = con1.createStatement();
		Statement stmt2 = con2.createStatement();
		SQLServerBulkCopy bulkCopy = new SQLServerBulkCopy(con2)) {
	
	// source data
	CustomResultSet rs1 = new CustomResultSet("SELECT * FROM some_table");
	
	// destination table
	bulkCopy.setDestinationTableName("dest_table");
	
	// other options
	SQLServerBulkCopyOptions options = new SQLServerBulkCopyOptions();
	options.setKeepIdentity(true);
	options.setKeepNulls(true);
	options.setBatchSize(5000);
	options.setBulkCopyTimeout(0);
	bulkCopy.setBulkCopyOptions(options);
	
	// execute
	bulkCopy.writeToServer(rs1);
}

A workaround that's better than the one I posted above is to again wrap the ResultSetMetaData of the source table and use meta data from the destination table to overwrite which columns should be reported back as VARCHAR and which as NVARCHAR:

// destination table unicode columns? manually set ResultSet1 column types to NVARCHAR.
ResultSet rs2 = stmt2.executeQuery("SELECT TOP 0 * FROM dest_table");
ResultSetMetaData rs2Meta = rs2.getMetaData();
for (int i = 1; i <= rs2Meta.getColumnCount(); i++) {
	if (rs2Meta.getColumnType(i) == Types.NVARCHAR || rs2Meta.getColumnType(i) == Types.LONGNVARCHAR) {
		// assumes matching i across tables / can be easily changed to match on column names
		rs1.getMetaData().addUnicodeColumn(i);
	}
}

Where rs1.getMetaData() is a ResultSetMetaData wrapper with custom implementations:

private List<Integer> unicodeColumns = new ArrayList<>();

...

public void addUnicodeColumn(int column) {
	unicodeColumns.add(column);
}

public int getColumnType(int column) throws SQLException {
	if (unicodeColumns.contains(column)) {
		return Types.NVARCHAR;
	} else {
		return meta.getColumnType(column);
	}
}

public String getColumnTypeName(int column) throws SQLException {
	if (unicodeColumns.contains(column)) {
		return "NVARCHAR";
	} else {
		return meta.getColumnTypeName(column);
	}
}

ulvii · 2019-11-26T22:56:45Z

Hi @pbcornelius ,
I think I have a fix for your scenario. Please try the attached
jars and let me know if the issue is resolved. I will create a PR as soon as I complete testing.
issue1181.zip

pbcornelius · 2019-11-27T16:52:15Z

It seemed to work in a quick test with this as the source SELECT statement:
SELECT 'testabcဦ'

ဦ should be outside of the CP1252 scope.

I did not set sendStringParametersAsUnicode in the connection URL, so I guess that's the default behaviour now for bulk copy?

ulvii · 2019-11-27T17:42:57Z

Hi @pbcornelius ,
Thanks for confirming. Yes the behavoiur is by default, if the destination column is unicode, it makes sense to send the data as unicode. sendStringParametersAsUnicode is for PreparedStatement parameters, should not be used for bulk copy.

ulvii · 2019-12-06T22:51:40Z

#1193 merged

ulvii mentioned this issue Nov 27, 2019

Fix | Send bulk data as unicode when destination column is nchar/nvarchar #1193

Merged

ulvii closed this as completed Dec 6, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sendStringParametersAsUnicode not considered for SQLServerBulkCopy? #1181

sendStringParametersAsUnicode not considered for SQLServerBulkCopy? #1181

pbcornelius commented Nov 8, 2019

peterbae commented Nov 8, 2019

ulvii commented Nov 15, 2019

pbcornelius commented Nov 15, 2019 •

edited

Loading

ulvii commented Nov 26, 2019

pbcornelius commented Nov 27, 2019

ulvii commented Nov 27, 2019

ulvii commented Dec 6, 2019

sendStringParametersAsUnicode not considered for SQLServerBulkCopy? #1181

sendStringParametersAsUnicode not considered for SQLServerBulkCopy? #1181

Comments

pbcornelius commented Nov 8, 2019

Driver version

Problem description

peterbae commented Nov 8, 2019

ulvii commented Nov 15, 2019

pbcornelius commented Nov 15, 2019 • edited Loading

ulvii commented Nov 26, 2019

pbcornelius commented Nov 27, 2019

ulvii commented Nov 27, 2019

ulvii commented Dec 6, 2019

pbcornelius commented Nov 15, 2019 •

edited

Loading