Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLServerBulkCSVFileRecord drops empty trailing columns when escapeDelimiters = true #1426

Closed
rripley opened this issue Sep 11, 2020 · 1 comment · Fixed by #1438
Closed

Comments

@rripley
Copy link

rripley commented Sep 11, 2020

Driver version

8.4.1

SQL Server version

Microsoft SQL Server 2016 (SP2-CU8) (KB4505830) - 13.0.5426.0 (X64)

Client Operating System

Windows 10

JAVA/JVM version

Amazon Corretto 11

Table schema

CREATE TABLE [exclu].[oig_leie](
	[LASTNAME] [varchar](20),
	[FIRSTNAME] [varchar](15),
	[MIDNAME] [varchar](15),
	[BUSNAME] [varchar](30),
	[GENERAL] [varchar](20),
	[SPECIALTY] [varchar](20),
	[UPIN] [varchar](6),
	[NPI] [varchar](10),
	[DOB] [date],
	[ADDRESS] [varchar](30),
	[CITY] [varchar](20),
	[STATE] [varchar](2),
	[ZIP] [varchar](5),
	[EXCLTYPE] [varchar](9),
	[EXCLDATE] [date],
	[REINDATE] [varchar](8),
	[WAIVERDATE] [varchar](8),
	[WVRSTATE] [varchar](2)
)

Problem description

  1. Expected behaviour: CSV with empty last column would load successfully when SQLServerBulkCSVFileRecord.setEscapeColumnDelimitersCSV(true).
  2. Actual behaviour: Exception thrown because number of columns in data does not match number of columns in header
  3. Error message/stack trace:
Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: Unable to retrieve data from the source.
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeBatchData(SQLServerBulkCopy.java:3626)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.doInsertBulk(SQLServerBulkCopy.java:1566)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.access$300(SQLServerBulkCopy.java:64)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy$1InsertBulk.doExecute(SQLServerBulkCopy.java:662)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7375)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:3206)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.sendBulkLoadBCP(SQLServerBulkCopy.java:696)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeToServer(SQLServerBulkCopy.java:1654)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeToServer(SQLServerBulkCopy.java:619)
	... 2 more
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Source data does not match source schema.
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCSVFileRecord.getRowData(SQLServerBulkCSVFileRecord.java:267)
	at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeBatchData(SQLServerBulkCopy.java:3623)
	... 10 more
  1. Any other details that can be helpful:
    -1 needs to be added to the split invocation at line 200 and 247 of SQLServerBulkCSVFileRecord.java to mirror the split invocation when escapeDelimiters is false, otherwise blank columns at the end of the line are dropped. E.g., escapeQuotesRFC4180(currentLine.split(delimiter + escapeSplitPattern, -1))

Reproduction code

URL leieCSV = new URL("https://oig.hhs.gov/exclusions/downloadables/UPDATED.csv");
URLConnection leieCSVConnection = leieCSV.openConnection();

try(SQLServerBulkCopy bulkCopy = new SQLServerBulkCopy(conn);
    SQLServerBulkCSVFileRecord fileRecord = new SQLServerBulkCSVFileRecord(leieCSVConnection.getInputStream(), leieCSVConnection.getContentEncoding(), ",", true))
{
    fileRecord.setEscapeColumnDelimitersCSV(true);
    fileRecord.addColumnMetadata(1, "LASTNAME", java.sql.Types.VARCHAR, 20, 0);
    fileRecord.addColumnMetadata(2, "FIRSTNAME", java.sql.Types.VARCHAR, 15, 0);
    fileRecord.addColumnMetadata(3, "MIDNAME", java.sql.Types.VARCHAR, 15, 0);
    fileRecord.addColumnMetadata(4, "BUSNAME", java.sql.Types.VARCHAR, 30, 0);
    fileRecord.addColumnMetadata(5, "GENERAL", java.sql.Types.VARCHAR, 20, 0);
    fileRecord.addColumnMetadata(6, "SPECIALTY", java.sql.Types.VARCHAR, 20, 0);
    fileRecord.addColumnMetadata(7, "UPIN", java.sql.Types.VARCHAR, 6, 0);
    fileRecord.addColumnMetadata(8, "NPI", java.sql.Types.VARCHAR, 10, 0);
    fileRecord.addColumnMetadata(9, "DOB", java.sql.Types.DATE, 0, 0);
    fileRecord.addColumnMetadata(10, "ADDRESS", java.sql.Types.VARCHAR, 30, 0);
    fileRecord.addColumnMetadata(11, "CITY", java.sql.Types.VARCHAR, 20, 0);
    fileRecord.addColumnMetadata(12, "STATE", java.sql.Types.VARCHAR, 2, 0);
    fileRecord.addColumnMetadata(13, "ZIP", java.sql.Types.VARCHAR, 5, 0);
    fileRecord.addColumnMetadata(14, "EXCLTYPE", java.sql.Types.VARCHAR, 9, 0);
    fileRecord.addColumnMetadata(15, "EXCLDATE", java.sql.Types.DATE, 0, 0);
    fileRecord.addColumnMetadata(16, "REINDATE", java.sql.Types.VARCHAR, 8, 0);
    fileRecord.addColumnMetadata(17, "WAIVERDATE", java.sql.Types.VARCHAR, 8, 0);
    fileRecord.addColumnMetadata(18, "WVRSTATE", java.sql.Types.VARCHAR, 2, 0);
    
    stmt.executeUpdate("TRUNCATE TABLE exclu.oig_leie");
    bulkCopy.setDestinationTableName("exclu.oig_leie");
    bulkCopy.writeToServer(fileRecord);
}
@rene-ye
Copy link
Member

rene-ye commented Sep 11, 2020

Hi @rripley, thanks for the reprocode, we'll take a look and get back to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants