Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query Cancellation Hangs When Using Multi Subnet Failover #144

Closed
sehrope opened this issue Feb 11, 2017 · 2 comments
Closed

Query Cancellation Hangs When Using Multi Subnet Failover #144

sehrope opened this issue Feb 11, 2017 · 2 comments
Assignees
Labels
Under Review Used for pull requests under review

Comments

@sehrope
Copy link
Contributor

sehrope commented Feb 11, 2017

Normally we use the standard Statement.cancel() method to cancel running queries. If it's called on a connection made with multiSubnetFailover=true, it has no effect and both the original Thread executing the Statement and the Thread issuing the cancel() will hang until the statement completes.

This issue been tested and confirmed on the latest stable SQL Server JDBC driver as well as a number of older drivers (prior to the 6.x branch).

See below for a self-contained example of this. Replace the JDBC url with any recent version of SQL Server and flip the boolean multiSubnetFailover property between true/false to see the impact. The sample code issues a dummy WAITFOR statement to simulate a slow query and then tries to cancel it in a separate thread. If multiSubnetFailover is enabled, the cancellation never occurs as the canceler thread blocks waiting to write to the nio Channel (probably trying to write the ATTENTION cancellation packet).

Stack trace for thread: stmt-canceller

java.nio.channels.Channels.writeFully(Channels.java:96)
java.nio.channels.Channels.access$000(Channels.java:61)
java.nio.channels.Channels$1.write(Channels.java:174)
com.microsoft.sqlserver.jdbc.TDSChannel$ProxyOutputStream.writeInternal(IOBuffer.java:1059)
com.microsoft.sqlserver.jdbc.TDSChannel$ProxyOutputStream.write(IOBuffer.java:1051)
sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:876)
sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:847)
sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
com.microsoft.sqlserver.jdbc.TDSChannel.write(IOBuffer.java:1887)
com.microsoft.sqlserver.jdbc.TDSWriter.flush(IOBuffer.java:4285)
com.microsoft.sqlserver.jdbc.TDSWriter.writePacket(IOBuffer.java:4186)
com.microsoft.sqlserver.jdbc.TDSWriter.endMessage(IOBuffer.java:3192)
com.microsoft.sqlserver.jdbc.TDSWriter.sendAttention(IOBuffer.java:4164)
com.microsoft.sqlserver.jdbc.TDSCommand.interrupt(IOBuffer.java:7647)
com.microsoft.sqlserver.jdbc.SQLServerStatement.cancel(SQLServerStatement.java:1193)
Main$2.run(Main.java:64)

Stack trace for thread: main

sun.nio.ch.FileDispatcherImpl.read0(Native Method)
sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
sun.nio.ch.IOUtil.read(IOUtil.java:197)
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:192)
sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
com.microsoft.sqlserver.jdbc.TDSChannel$ProxyInputStream.readInternal(IOBuffer.java:949)
com.microsoft.sqlserver.jdbc.TDSChannel$ProxyInputStream.read(IOBuffer.java:937)
sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
sun.security.ssl.InputRecord.read(InputRecord.java:503)
sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1871)
com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6602)
com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:7919)
com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:879)
com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:778)
com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7465)
com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2271)
com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:191)
com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:166)
com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:751)
Main.main(Main.java:77)

Stepping through a debugger and comparing the stacks with multiSubnetFailover enabled/disabled, we noticed that the underlying SocketInputStream is a different class. When the feature is disabled, it's a java.net.SocketInputStream. When it's enabled it's a SocketAdaptor.SocketInputStream.

If I had to guess, I'd say that whatever the JDK is doing to spoof a Socket interface atop a Channel isn't as concurrent as a true Socket. It's a probably a shared lock acquired by a reader on the channel blocking the writer. These two lines look suspicious as they're both synchronizing on the same object:

https://github.com/openjdk-mirror/jdk/blob/jdk8u/jdk8u/master/src/share/classes/sun/nio/ch/SocketAdaptor.java#L188

https://github.com/openjdk-mirror/jdk/blob/jdk8u/jdk8u/master/src/share/classes/java/nio/channels/Channels.java#L95

While this could be a bug in the JDK, I'm not sure that non-blocking between readers/writers is part of the API contract for Sockets or at least Sockets derived from Channels. As such, this might be something you'd need to have the driver handle. The answer might be to switch to using real Sockets rather than Channels even when multiSubnetFailover is enabled.


Sample Program Reproduce Issue - QueryCancelTest.java

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.LocalDateTime;
import java.util.Properties;

public class QueryCancelTest {
  private static final long CANCEL_WAIT_MILLIS = 2000;

  private static void log(String format, Object... args) {
    String timestamp = LocalDateTime.now().toString();
    String threadName = Thread.currentThread().getName();
    System.out.printf(timestamp + " [" + threadName + "] - " + format + "\n", args);
  }

  private static void dumpStackTrace(Thread thread) {
    StringBuilder sb = new StringBuilder();
    sb.append("Stack trace for thread: " + thread.getName() + "\n");
    for (StackTraceElement elem : thread.getStackTrace()) {
      sb.append("    " + elem.toString() + "\n");
    }
    System.err.println(sb.toString());
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  private static void delayPrintStackTrace(final Thread thread) {
    new Thread() {
      public void run() {
        sleepQuietly(2000);
        dumpStackTrace(thread);
      }
    }.start();
  }

  public static void main(String args[]) {
    String host = "db.example.com";
    int port = 1433;
    String databaseName = "test";
    String user = "myuser";
    String password = "mypass";
    boolean multiSubnetFailover = true;

    String url = "jdbc:sqlserver://" + host + ":" + port;
    Properties props = new Properties();
    props.put("databaseName", databaseName);
    props.put("user", user);
    props.put("password", password);
    if (multiSubnetFailover) {
      props.put("multiSubnetFailover", "true");
    }

    try {
      try (Connection conn = DriverManager.getConnection(url, props)) {
        log("Created Connection");
        try (final Statement stmt = conn.createStatement()) {
          log("Created Statement");

          final Thread mainThread = Thread.currentThread();
          final Thread cancellationThread = new Thread() {
            public void run() {
              log("Started");
              try {
                log("Waiting for %s ms for statement to begin", CANCEL_WAIT_MILLIS);
                Thread.sleep(CANCEL_WAIT_MILLIS);
                log("Finished waiting, will attempt to cancel statement");
                delayPrintStackTrace(this);
                delayPrintStackTrace(mainThread);
                log("Before stmt.cancel()");
                stmt.cancel();
                log("After stmt.cancel()");
              } catch (Exception e) {
                e.printStackTrace();
              }
              log("Finished");
            }
          };
          cancellationThread.setName("stmt-canceller");
          cancellationThread.start();

          try {
            log("Before stmt.execute(...)");
            boolean hasResultSet = stmt.execute("WAITFOR DELAY '00:00:15'");
            log("After stmt.execute(...) hasResultSet=%s", hasResultSet);
          } catch (SQLException e) {
            e.printStackTrace();
          }
        }
      }
      log("Finished successfully");
    } catch (Exception e) {
      e.printStackTrace();
      System.exit(1);
    }
    System.exit(0);
  }
}
@v-nisidh v-nisidh added the Under Review Used for pull requests under review label Feb 14, 2017
@xiangyushawn
Copy link
Contributor

@sehrope thank you very much for filing this issue. We really appreciate the very detailed description and the sample code you gave. We will take a look and keep you updated. Thanks.

@xiangyushawn
Copy link
Contributor

hello @sehrope we have created PR to handle this issue. Please let us know if it works for you. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Under Review Used for pull requests under review
Projects
None yet
Development

No branches or pull requests

3 participants