Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing errors #115

Merged
merged 3 commits into from
Feb 14, 2024
Merged

Fixing errors #115

merged 3 commits into from
Feb 14, 2024

Conversation

vhavlena
Copy link
Collaborator

@vhavlena vhavlena commented Jan 26, 2024

  • fixing inclusion graph assertion for the case w = w
  • instance caching in string_theory_propagation

@vhavlena vhavlena marked this pull request as ready for review January 30, 2024 12:09
Copy link
Collaborator

@Adda0 Adda0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me.

@@ -146,6 +146,10 @@ namespace smt::noodler {
STRACE("str", tout << __LINE__ << " enter " << __FUNCTION__ << std::endl;);
STRACE("str", tout << mk_pp(expr, get_manager()) << std::endl;);

if (propgated_string_theory.contains(expr))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in propgated_string_theory?

@jurajsic
Copy link
Member

jurajsic commented Jan 31, 2024

We give sat instead of unsat for QF_SLIA/20190311-str-small-rw-Noetzli/str-pred-small-rw/str-pred-small-rw_719.smt2.
On debug, it gives unknown, but this is because of different cases handled first, it gets to some cases with not contains (decision procedure gives l_undef).
If I change lines 896-910 in theory_str_noodler.cpp to

            } else { //if (result == l_false) {
                // we did not find a solution (with satisfiable length constraints)
                // we need to block current assignment
                STRACE("str", tout << "assignment unsat " << mk_pp(block_len, m) << std::endl;);
                if(m.is_false(block_len)) {
                    block_curr_len(block_len, false, true);
                } else {
                    block_curr_len(block_len);
                }
                return FC_CONTINUE;
            // } else {
            //     // we could not decide if there is solution, let's just give up
            //     STRACE("str", tout << "giving up" << std::endl);
            //     return FC_GIVEUP;
            }

I can force it to handle not contains as unsat, and then it gets to the case that returns sat, there might be some problem replace and contains handling.

@jurajsic
Copy link
Member

jurajsic commented Feb 9, 2024

The impact of these changes (still with the bug):

Benchmark sygus_qgen/to120.csv
# of formulae: 343
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f   343     0   0.01   0.01    4.95    343        0          0     0         0        0
z3-noodler-34ea542-2cddb2f   343     0   0.01   0.01    5.08    343        0          0     0         0        0
--------------------------------------------------

Benchmark denghang/to120.csv
# of formulae: 999
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f   999     0   0.04   0.02   43.27    186      813          0     0         0        0
z3-noodler-34ea542-2cddb2f   999     0   0.04   0.02   43.16    186      813          0     0         0        0
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_2461/4044844237.py:37: DtypeWarning: Columns (51,69,73,77,85,87,89,93,95,97,99) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark automatark/to120.csv
# of formulae: 15995
--------------------------------------------------
tool                           ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  15935    60   0.12   0.02  1977.69  10452     5483          0    60         0        0
z3-noodler-34ea542-2cddb2f  15935    60   0.12   0.02  1980.21  10452     5483          0    60         0        0
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_2461/4044844237.py:37: DtypeWarning: Columns (35,43,47,57,59,61,63,65,139,149,151,153,155,157,159,161) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark stringfuzz/to120.csv
# of formulae: 11618
--------------------------------------------------
tool                           ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  10574  1044   0.03   0.01  269.24   6087     4487       1044     0         0        0
z3-noodler-34ea542-2cddb2f  10574  1044   0.03   0.01  267.22   6087     4487          0     0         0     1044
--------------------------------------------------

Benchmark norn/to120.csv
# of formulae: 1027
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  1027     0   0.01   0.01   14.92    712      315          0     0         0        0
z3-noodler-34ea542-2cddb2f  1027     0   0.01   0.01   14.34    712      315          0     0         0        0
--------------------------------------------------

Benchmark slog/to120.csv
# of formulae: 1976
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  1976     0   0.01   0.01   28.80    808     1168          0     0         0        0
z3-noodler-34ea542-2cddb2f  1976     0   0.01   0.01   29.14    808     1168          0     0         0        0
--------------------------------------------------

Benchmark slent/to120.csv
# of formulae: 1128
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  1121     7   0.02   0.01   21.51    627      494          1     6         0        0
z3-noodler-34ea542-2cddb2f  1126     2   0.02   0.01   27.15    629      497          1     1         0        0
--------------------------------------------------

Benchmark transducer_plus/to120.csv
# of formulae: 91
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f     0    91    nan    nan    0.00      0        0         91     0         0        0
z3-noodler-34ea542-2cddb2f     0    91    nan    nan    0.00      0        0          0     0         0       91
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_2461/4044844237.py:37: DtypeWarning: Columns (1,3,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145,147,149,151,153,157,159,161,163,165) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark kaluza/to120.csv
# of formulae: 19432
--------------------------------------------------
tool                           ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  19171   261   0.04   0.01   682.31  16077     3094          0   259         2        0
z3-noodler-34ea542-2cddb2f  19175   257   0.08   0.01  1452.05  16074     3101          0   256         1        0
--------------------------------------------------

Benchmark webapp/to120.csv
# of formulae: 681
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f   355   326   0.09   0.02   32.29    132      223        316     6         4        0
z3-noodler-34ea542-2cddb2f   358   323   0.44   0.02  158.53    132      226          0     3         4      316
--------------------------------------------------

Benchmark kepler/to120.csv
# of formulae: 587
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f   584     3   0.03   0.01   16.94    284      300          0     3         0        0
z3-noodler-34ea542-2cddb2f   584     3   0.03   0.01   16.90    284      300          0     3         0        0
--------------------------------------------------

Benchmark woorpje/to120.csv
# of formulae: 809
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f   731    78   0.35   0.02  256.78    573      158          0    23        55        0
z3-noodler-34ea542-2cddb2f   732    77   0.35   0.02  252.57    573      159          0    24        53        0
--------------------------------------------------

Benchmark leetcode/to120.csv
# of formulae: 2652
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  2649     3   0.05   0.01  128.61    865     1784          0     0         3        0
z3-noodler-34ea542-2cddb2f  2648     4   0.05   0.01  119.43    864     1784          0     1         1        2
--------------------------------------------------

Benchmark str_small_rw/to120.csv
# of formulae: 1880
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  1730   150   0.14   0.01  238.60      5     1725        103    30        17        0
z3-noodler-34ea542-2cddb2f  1731   149   0.10   0.01  172.72      4     1727        100    25        13       11
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_2461/4044844237.py:37: DtypeWarning: Columns (1,3,5,7,9,11,13,15,17,19,21,23,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,65,69,71,73,75,79,81,83,85,87,89,91,93,95,97,103,105,107,109,111,113,127,131) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark full_str_int/to120.csv
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  14504  2464   0.10   0.01  1384.64    757    13747       1607   822        35        0
z3-noodler-34ea542-2cddb2f  14504  2464   0.11   0.01  1541.23    757    13747          0   801        43     1620
--------------------------------------------------

Benchmark pyex/to120.csv
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1c65ad6-2cddb2f  23742   103   0.25   0.10  5846.12  20378     3364          0    99         4        0
z3-noodler-34ea542-2cddb2f  23747    98   0.24   0.11  5769.71  20376     3371          0    95         3        0
--------------------------------------------------

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@jurajsic
Copy link
Member

Current situation:

Benchmark sygus_qgen/to120.csv
# of formulae: 343
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f   343     0   0.01   0.01    5.10    343        0          0     0         0        0
cvc5-1.0.8                   343     0   0.27   0.10   92.87    343        0          0     0         0        0
z3-4.12.2                    343     0   0.06   0.05   20.06    343        0          0     0         0        0
cvc5-1.1.1                   343     0   0.23   0.04   79.96    343        0          0     0         0        0
z3-4.12.5                    343     0   0.03   0.02   10.59    343        0          0     0         0        0
z3-noodler-c69c980-2cddb2f   343     0   0.01   0.01    5.11    343        0          0     0         0        0
--------------------------------------------------

Benchmark denghang/to120.csv
# of formulae: 999
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f   999     0   0.04   0.02   44.48    186      813          0     0         0        0
cvc5-1.0.8                   981    18   0.46   0.02  452.30    169      812          0    18         0        0
z3-4.12.2                    881   118   0.45   0.13  398.98    186      695          0   118         0        0
cvc5-1.1.1                   981    18   0.43   0.00  418.42    169      812          0    18         0        0
z3-4.12.5                    883   116   0.39   0.08  341.13    186      697          0   105        11        0
z3-noodler-c69c980-2cddb2f   999     0   0.04   0.02   44.81    186      813          0     0         0        0
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_33890/4044844237.py:37: DtypeWarning: Columns (51,69,73,77,85,87,89,93,95,97,99) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark automatark/to120.csv
# of formulae: 15995
--------------------------------------------------
tool                           ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  15935    60   0.13   0.02  2013.72  10452     5483          0    60         0        0
cvc5-1.0.8                  15901    94   0.17   0.01  2639.32  10459     5442          0    85         9        0
z3-4.12.2                   15882   113   0.31   0.05  4845.79  10471     5411          0   112         1        0
cvc5-1.1.1                  15899    96   0.15   0.00  2418.43  10459     5440          0    88         8        0
z3-4.12.5                   15869   126   0.40   0.02  6343.72  10468     5401          0   125         1        0
z3-noodler-c69c980-2cddb2f  15935    60   0.12   0.02  1968.40  10452     5483          0    60         0        0
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_33890/4044844237.py:37: DtypeWarning: Columns (35,43,47,57,59,61,63,65,139,149,151,153,155,157,159,161,163,165,167) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark stringfuzz/to120.csv
# of formulae: 11618
--------------------------------------------------
tool                           ✅    ❌    avg    med      time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  --------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  11572    46   0.03   0.01    315.04   6349     5223         42     4         0        0
cvc5-1.0.8                  10552  1066   2.61   0.02  27503.53   5777     4775          0  1065         1        0
z3-4.12.2                   11232   386   4.14   0.04  46483.27   6148     5084          0   367        19        0
cvc5-1.1.1                  10522  1096   2.65   0.00  27893.23   5756     4766          0  1095         1        0
z3-4.12.5                   11072   546   4.09   0.01  45283.01   5986     5086          0   421       125        0
z3-noodler-c69c980-2cddb2f  11572    46   0.03   0.01    313.64   6349     5223         42     4         0        0
--------------------------------------------------

Benchmark norn/to120.csv
# of formulae: 1027
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  1027     0   0.01   0.01   14.91    712      315          0     0         0        0
cvc5-1.0.8                   942    85   0.25   0.03  236.08    705      237          0    85         0        0
z3-4.12.2                    903   124   0.19   0.05  168.88    712      191          0   124         0        0
cvc5-1.1.1                   943    84   0.29   0.01  274.24    705      238          0    84         0        0
z3-4.12.5                    904   123   0.32   0.02  293.44    712      192          0   123         0        0
z3-noodler-c69c980-2cddb2f  1027     0   0.01   0.01   15.13    712      315          0     0         0        0
--------------------------------------------------

Benchmark slog/to120.csv
# of formulae: 1976
--------------------------------------------------
tool                          ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  1976     0   0.01   0.01    29.62    808     1168          0     0         0        0
cvc5-1.0.8                  1976     0   0.01   0.01    18.27    808     1168          0     0         0        0
z3-4.12.2                   1905    71   0.58   0.03  1109.01    737     1168          0    71         0        0
cvc5-1.1.1                  1976     0   0.00   0.00     1.00    808     1168          0     0         0        0
z3-4.12.5                   1943    33   0.12   0.01   224.64    775     1168          0    33         0        0
z3-noodler-c69c980-2cddb2f  1976     0   0.01   0.01    29.41    808     1168          0     0         0        0
--------------------------------------------------

Benchmark slent/to120.csv
# of formulae: 1128
--------------------------------------------------
tool                          ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  1125     3   0.02   0.01    25.52    629      496          1     2         0        0
cvc5-1.0.8                  1106    22   0.87   0.01   965.39    618      488          0    22         0        0
z3-4.12.2                   1054    74   0.38   0.03   401.39    573      481          0    74         0        0
cvc5-1.1.1                  1104    24   0.98   0.00  1080.27    618      486          0    24         0        0
z3-4.12.5                   1053    75   0.29   0.01   300.43    569      484          0    75         0        0
z3-noodler-c69c980-2cddb2f  1126     2   0.02   0.01    27.58    629      497          1     1         0        0
--------------------------------------------------

Benchmark transducer_plus/to120.csv
# of formulae: 91
--------------------------------------------------
tool                          ✅    ❌     avg     med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f     0    91  nan     nan       0.00      0        0         91     0         0        0
cvc5-1.0.8                    42    49   13.34    3.17  560.14     41        1          0    49         0        0
z3-4.12.2                      1    90    0.04    0.04    0.04      0        1         90     0         0        0
cvc5-1.1.1                    41    50   15.49    3.71  635.07     40        1          0    50         0        0
z3-4.12.5                      1    90    0.01    0.01    0.01      0        1         90     0         0        0
z3-noodler-c69c980-2cddb2f     0    91  nan     nan       0.00      0        0         91     0         0        0
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_33890/4044844237.py:37: DtypeWarning: Columns (1,3,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145,147,149,151,153,157,159,161,163,165,167,169,171) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark kaluza/to120.csv
# of formulae: 19432
--------------------------------------------------
tool                           ✅    ❌    avg    med     time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  -------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  19172   260   0.05   0.01  1015.37  16076     3096          0   259         1        0
cvc5-1.0.8                  19432     0   0.06   0.01  1253.63  16303     3129          0     0         0        0
z3-4.12.2                   19268   164   0.11   0.02  2095.90  16145     3123          0   164         0        0
cvc5-1.1.1                  19432     0   0.06   0.00  1122.86  16303     3129          0     0         0        0
z3-4.12.5                   19147   285   0.10   0.01  1938.97  16031     3116          0   285         0        0
z3-noodler-c69c980-2cddb2f  19173   259   0.04   0.01   728.57  16077     3096          0   258         1        0
--------------------------------------------------

Benchmark webapp/to120.csv
# of formulae: 681
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f   358   323   0.49   0.02  176.71    132      226        316     3         4        0
cvc5-1.0.8                   588    93   1.36   0.05  800.32    165      423          0    93         0        0
z3-4.12.2                    457   224   0.71   0.04  323.64     42      415        123   101         0        0
cvc5-1.1.1                   587    94   1.35   0.01  791.79    164      423          0    94         0        0
z3-4.12.5                    447   234   0.46   0.02  203.56     33      414        119   115         0        0
z3-noodler-c69c980-2cddb2f   358   323   0.46   0.02  165.78    132      226        316     3         4        0
--------------------------------------------------

Benchmark kepler/to120.csv
# of formulae: 587
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f   584     3   0.03   0.01   17.29    284      300          0     3         0        0
cvc5-1.0.8                   347   240   0.01   0.01    4.86     50      297          0   240         0        0
z3-4.12.2                    274   313   0.06   0.04   17.09    131      143          0   313         0        0
cvc5-1.1.1                   347   240   0.00   0.00    0.65     50      297          0   240         0        0
z3-4.12.5                    278   309   0.09   0.02   25.70    135      143          0   284        25        0
z3-noodler-c69c980-2cddb2f   584     3   0.03   0.01   17.22    284      300          0     3         0        0
--------------------------------------------------

Benchmark woorpje/to120.csv
# of formulae: 809
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f   734    75   0.39   0.02  282.84    575      159          0    22        53        0
cvc5-1.0.8                   755    54   0.11   0.03   83.22    591      164          0    54         0        0
z3-4.12.2                    784    25   0.52   0.04  405.28    620      164          0    25         0        0
cvc5-1.1.1                   755    54   0.06   0.01   44.90    591      164          0    54         0        0
z3-4.12.5                    782    27   0.27   0.02  214.78    618      164          0    27         0        0
z3-noodler-c69c980-2cddb2f   735    74   0.36   0.02  263.18    576      159          0    19        55        0
--------------------------------------------------

Benchmark leetcode/to120.csv
# of formulae: 2652
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  2650     2   0.02   0.01   61.74    866     1784          0     0         2        0
cvc5-1.0.8                  2652     0   0.03   0.02   87.84    867     1785          0     0         0        0
z3-4.12.2                   2652     0   0.04   0.02   99.66    867     1785          0     0         0        0
cvc5-1.1.1                  2652     0   0.01   0.00   32.58    867     1785          0     0         0        0
z3-4.12.5                   2652     0   0.02   0.01   43.00    867     1785          0     0         0        0
z3-noodler-c69c980-2cddb2f  2647     5   0.02   0.01   60.82    863     1784          2     0         3        0
--------------------------------------------------

Benchmark str_small_rw/to120.csv
# of formulae: 1880
--------------------------------------------------
tool                          ✅    ❌    avg    med    time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -----  -----  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  1743   137   0.25   0.01  427.59      4     1739        101    25        11        0
cvc5-1.0.8                  1861    19   0.03   0.01   48.26     22     1839          2    17         0        0
z3-4.12.2                   1817    63   0.06   0.03  112.09     25     1792          0    63         0        0
cvc5-1.1.1                  1861    19   0.02   0.00   28.48     20     1841          2    17         0        0
z3-4.12.5                   1819    61   0.07   0.01  132.27     25     1794          0    61         0        0
z3-noodler-c69c980-2cddb2f  1733   147   0.22   0.01  373.44      4     1729        110    24        13        0
--------------------------------------------------

/var/folders/f1/d3j3qhzn3lg8qg7r1x_2vxnm0000gn/T/ipykernel_33890/4044844237.py:37: DtypeWarning: Columns (1,3,5,7,9,11,13,15,17,19,21,23,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,65,69,71,73,75,79,81,83,85,87,89,91,93,95,97,103,105,107,109,111,113,127,131) have mixed types. Specify dtype option on import or set low_memory=False.
Benchmark full_str_int/to120.csv
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌    avg    med      time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  --------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  14557  2411   0.13   0.01   1903.28    810    13747       1199  1051       161        0
cvc5-1.0.8                  16968     0   0.31   0.03   5258.46   2526    14442          0     0         0        0
z3-4.12.2                   16734   234   1.45   0.04  24231.84   2478    14256          0   234         0        0
cvc5-1.1.1                  16963     5   0.27   0.01   4657.42   2522    14441          0     5         0        0
z3-4.12.5                   16727   241   1.21   0.01  20316.56   2471    14256          0   241         0        0
z3-noodler-c69c980-2cddb2f  14555  2413   0.14   0.01   2103.69    808    13747       1201  1064       148        0
--------------------------------------------------

Benchmark pyex/to120.csv
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌    avg    med      time    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -----  -----  --------  -----  -------  ---------  ----  --------  -------
z3-noodler-1174fd1-2cddb2f  23751    94   0.26   0.10   6152.39  20380     3371          0    93         1        0
cvc5-1.0.8                  23811    34   0.32   0.13   7665.50  20392     3419          0    33         1        0
z3-4.12.2                   22774  1071   2.77   0.24  63069.90  19503     3271          0  1071         0        0
cvc5-1.1.1                  23802    43   0.21   0.06   5103.25  20383     3419          0    43         0        0
z3-4.12.5                   22839  1006   2.42   0.12  55195.90  19568     3271          0  1006         0        0
z3-noodler-c69c980-2cddb2f  23747    98   0.26   0.11   6219.00  20376     3371          0    95         3        0
--------------------------------------------------

@jurajsic
Copy link
Member

  • there are no incorrect results
  • the fixes in the asserts for the inclusion graphs helped with those few cases of str_small_rw
  • the optimization with instance caching seems like it has no impact, but @vhavlena said that it stops some looping in full_str_int, even if the instances still lead to TO/MO, so we should keep it

@jurajsic
Copy link
Member

Also:
image

@vhavlena vhavlena merged commit 50a0930 into devel Feb 14, 2024
@vhavlena vhavlena deleted the asserts-fixing branch February 14, 2024 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants