cascading failure causing segfault in GtkReactive tests? #562
Comments
Is this a) reproducible, and b) does it also happen in 1.6? I have also seen segfaults on 1.6-beta in internal software of mine. A minimal reproducible example would be good to track this down. |
Those are the tests run by PkgEval. Note that tests for nightly have been failing forever since we switched to GitHub Actions: https://github.com/JuliaGraphics/Gtk.jl/actions?query=workflow%3ACI |
Hm, yes. But nightly != 1.6 right? |
Yes, but I was referring to point 1: tests have been consistently failing for months. Maybe looking at them might help. Julia v1.6 isn't tested at the moment in CI (but you just need to add one line to the matrix). |
Hm, "Error: Could not find a Julia version that matches 1.6". Do you know what to enter there? 1.6-? |
tests passing locally on Julia 1.6rc1 |
At the moment it's |
seems unrelated to the segfault though. |
OK, it's the same test failure as on nightly. |
I do not get why this is failing on Julia 1.6 (CI) but not on 1.3 (CI) and also not locally on 1.6. |
Locally I get:
|
@giordano: Can you somehow identify if different Gtk binaries are used for different Julia versions? |
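One way to approach the question above is to list the shared libraries that are actually loaded into the Julia process and compare the output across Julia versions. The following is only a sketch: `loaded_libs` is a hypothetical helper name, it relies solely on the `Libdl` standard library, and it assumes the Gtk binary has "gtk" somewhere in its file name.

```julia
using Libdl

# Hypothetical helper: paths of all shared libraries currently loaded into
# this Julia process whose file name contains `pattern` (case-insensitive).
loaded_libs(pattern::AbstractString) =
    filter(p -> occursin(lowercase(pattern), lowercase(basename(p))), Libdl.dllist())

# Print the Julia version, then every loaded library matching "gtk".
# Nothing is listed unless a package that loads Gtk (e.g. `using Gtk`)
# has been loaded earlier in the same session.
println(VERSION)
foreach(println, loaded_libs("gtk"))
```

Running this after `using Gtk` on each Julia version in CI would show whether the paths (and hence the binaries) differ.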
That's the fun part of running CI, isn't it? 😃 Note that you can SSH into the GitHub Actions runners for debugging by adding the step
    - name: Setup tmate session
      uses: mxschmitt/action-tmate@v3
      with:
        limit-access-to-actor: true
right before the run tests step (you can remove the last two lines if you don't want to use your SSH key to log in). See https://github.com/marketplace/actions/debugging-with-tmate |
TIL, thanks @giordano! |
Ok, I have some findings and can even reproduce the issue locally. Here is my test program:
So the issue is with the calls where the vector of active elements is obtained. Sometimes (not always) I get the following run:
So it seems that something went wrong internally in Gtk and then simply |
I do not get this error with Julia 1.5. |
These sorts of spontaneous errors are not ones I am capable of tracking down, since this certainly touches the core of the type conversion done in the glib module. The interesting question is what has changed from Julia 1.5 to 1.6 that can lead to this sort of non-deterministic error. Ping @vtjnash, who is much better prepared to solve this, and ping @KristofferC to make you aware of this regression. |
@jonathanBieler, the RadioButtonGroup code doesn't use Line 64 in d743a8b
If I change the rhs to
all immediately upon calling EDIT: the Line 494 in b61cb23
and the Line 74 in d743a8b
|
OK, this diff gets us closer to the design suggested in https://juliagraphics.github.io/Gtk.jl/latest/manual/customWidgets/:
diff --git a/src/buttons.jl b/src/buttons.jl
index 4574ee9..e19b533 100644
--- a/src/buttons.jl
+++ b/src/buttons.jl
@@ -59,9 +59,9 @@ mutable struct GtkRadioButtonGroup <: GtkContainer # NOT a native @gtktype
# the behavior is specified as undefined if the first
# element is moved to a new group
# do not rely on the current behavior, since it may change
- handle::GtkContainer
+ handle::Ptr{GObject}
anchor::GtkRadioButton
- GtkRadioButtonGroup(layout::GtkContainer) = new(layout)
+ GtkRadioButtonGroup(layout::GtkContainer) = gobject_move_ref(new(unsafe_convert(Ptr{GObject}, layout)), layout)
end
const GtkRadioButtonGroupLeaf = GtkRadioButtonGroup
macro GtkRadioButtonGroup(args...)
@@ -88,7 +88,7 @@ function push!(grp::GtkRadioButtonGroup, e::GtkRadioButton)
else
grp.anchor = e
end
- push!(grp.handle, e)
+ container_add!(grp.handle, e)
grp
end
function push!(grp::GtkRadioButtonGroup, label, active::Union{Bool, Nothing} = nothing)
@@ -100,7 +100,7 @@ function push!(grp::GtkRadioButtonGroup, label, active::Union{Bool, Nothing} = n
if isa(active, Bool)
gtk_toggle_button_set_active(e, active::Bool)
end
- push!(grp.handle, e)
+ container_add!(grp.handle, e)
grp
end
function start_(grp::GtkRadioButtonGroup)
diff --git a/src/container.jl b/src/container.jl
index a07ee41..b6609ee 100644
--- a/src/container.jl
+++ b/src/container.jl
@@ -1,9 +1,11 @@
+container_add!(w, child) = ccall((:gtk_container_add, libgtk), Nothing, (Ptr{GObject}, Ptr{GObject},), w, child)
+container_remove!(w, child) = ccall((:gtk_container_remove, libgtk), Nothing, (Ptr{GObject}, Ptr{GObject},), w, child)
function push!(w::GtkContainer, child)
- ccall((:gtk_container_add, libgtk), Nothing, (Ptr{GObject}, Ptr{GObject},), w, child)
+ container_add!(w, child)
w
end
function delete!(w::GtkContainer, child::GtkWidget)
- ccall((:gtk_container_remove, libgtk), Nothing, (Ptr{GObject}, Ptr{GObject},), w, child)
+ container_remove!(w, child)
w
end
function empty!(w::GtkContainer)
Sometimes it succeeds, but I still get intermittent failures like
that seem to be reduced by adding various sleeps. What's the right way to do this? |
I think you might need a Line 115 in d743a8b
|
Ho ho! I hadn't noticed that, but that's a good catch. I added it everywhere |
I think you're seriously on to something, though: if I wrap Lines 314 to 342 in d743a8b
in |
There really does seem to be something different between 1.5 and 1.6. If you put GC.gc(true); GC.gc(true); GC.gc(true) right after the As disclosure, I've also commented out most steps that run prior to the "radio" tests, and I've also removed the
diff --git a/test/gui.jl b/test/gui.jl
index 024af93..e5a9b81 100755
--- a/test/gui.jl
+++ b/test/gui.jl
@@ -2,7 +2,7 @@
using Gtk.ShortNames, Gtk.GConstants, Gtk.Graphics
import Gtk.deleteat!, Gtk.libgtk_version, Gtk.GtkToolbarStyle, Gtk.GtkFileChooserAction, Gtk.GtkResponseType
-
+#=
## for FileFilter
# This is just for testing, and be careful of garbage collection while using this
struct GtkFileFilterInfo
@@ -309,8 +309,8 @@ set_gtk_property!(check,:label,"new label")
showall(w)
destroy(w)
end
-
-@testset "radio" begin
+=#
+# @testset "radio" begin
choices = ["choice one", "choice two", "choice three", RadioButton("choice four"), Label("choice five")]
w = Window("Radio")
f = Gtk.GtkBox(:v); push!(w,f)
@@ -325,7 +325,9 @@ showall(w)
destroy(w)
r = RadioButtonGroup(choices,2)
+GC.gc(true); GC.gc(true); GC.gc(true)
@test length(r) == 5
+GC.enable(false)
@test sum([get_gtk_property(b,:active,Bool) for b in r]) == 1
itms = Vector{Any}(undef,length(r))
for (i,e) in enumerate(r)
@@ -335,12 +337,17 @@ for (i,e) in enumerate(r)
e[1]
end
end
+GC.enable(true)
@test setdiff(choices, itms) == [choices[4],]
@test setdiff(itms, choices) == ["choice four",]
+GC.enable(false)
@test get_gtk_property(get_gtk_property(r,:active),:label,AbstractString) == choices[2]
+GC.enable(true)
w = Window(r,"RadioGroup")|>showall
destroy(w)
-end
+# end
+# GC.enable(true)
+error("stop")
@testset "ToggleButton" begin
tb = ToggleButton("Off") |
Another observation: if I combine the above with disabling GC while the |
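The paired `GC.enable(false)` / `GC.enable(true)` calls in the test diff above can be wrapped so that GC is always restored, even if the enclosed test throws. This is a minimal sketch, not part of Gtk.jl; `with_gc_disabled` is a hypothetical helper name. It relies on `GC.enable` returning the previous GC state.

```julia
# Hypothetical helper: run `f()` with garbage collection disabled and restore
# the previous GC state afterwards, even if `f` throws.
function with_gc_disabled(f)
    was_enabled = GC.enable(false)   # GC.enable returns the previous state
    try
        return f()
    finally
        GC.enable(was_enabled)
    end
end

# Usage: the body runs without any collection being triggered.
total = with_gc_disabled() do
    sum(length(string(i)) for i in 1:1000)
end
println(total)
```

In the "radio" tests, a single `@test` could be placed inside the `do` block to check whether it only passes while GC is off.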
I am now onto other deadlines, but to leave this in a good state, here's a MWE:
using Revise # curiously, this seems to be important to making this fully reproducible, though CI fails without
using Gtk, Gtk.ShortNames, Test
choices = ["choice one", "choice two", "choice three", RadioButton("choice four"), Label("choice five")]
r = RadioButtonGroup(choices,2)
GC.gc(true); GC.gc(true); GC.gc(true)
@test length(r) == 5
@test sum([get_gtk_property(b,:active,Bool) for b in r]) == 1
This reliably passes on 1.5 and reliably fails on 1.6. If you comment out the Both are running Gtk 1.1.6. |
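To put a number on "reliably" versus "sometimes" when comparing Julia versions, a check can be repeated with forced full collections before each run. The helper below is a hypothetical sketch (the name `count_failures` and its keyword arguments are assumptions, not anything from Gtk.jl or Test); it uses only Base.

```julia
# Hypothetical helper: run `check` `n` times, forcing `sweeps` full
# collections before each run, and count the runs that throw or do not
# return `true`.
function count_failures(check; n::Integer = 100, sweeps::Integer = 3)
    failures = 0
    for _ in 1:n
        for _ in 1:sweeps
            GC.gc(true)
        end
        ok = try
            check() === true
        catch
            false
        end
        ok || (failures += 1)
    end
    return failures
end

# A trivially passing check, repeated 10 times.
println(count_failures(() -> 1 + 1 == 2, n = 10))
```

For the MWE, `check` could construct the `RadioButtonGroup` and return whether exactly one button is active.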
I've opened an issue (see link above) that identifies JuliaLang/julia#38180 as the probable source of the change. What's really odd is that Gtk doesn't use |
That changed WeakRefs so that they must be re-set by a finalizer if it wants to keep them active. E.g.
w = WeakRef(x)
finalizer(x) do x; keepalive(x) && (w.value = x); end |
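As a self-contained illustration of the pattern in the comment above (not Gtk.jl's actual implementation), the sketch below attaches a finalizer that re-sets a `WeakRef`. `Handle`, `keepalive`, and `weakly_track` are hypothetical names introduced only for this example; in particular, `keepalive` stands in for whatever condition decides that the object should remain reachable.

```julia
# Hypothetical mutable type; finalizers can only be attached to mutable objects.
mutable struct Handle
    id::Int
end

# Hypothetical predicate: should the object stay reachable through the WeakRef?
keepalive(h::Handle) = h.id > 0

function weakly_track(h::Handle)
    w = WeakRef(h)
    finalizer(h) do obj
        # Parentheses are needed because assignment binds more loosely than `&&`.
        keepalive(obj) && (w.value = obj)
    end
    return w
end

h = Handle(1)
w = weakly_track(h)
GC.gc(true)              # `h` is still strongly referenced by the global binding
println(w.value === h)
```

Here the object is never actually finalized while `h` is bound, so the example only shows how the pieces fit together; whether re-setting in the finalizer is sufficient for Gtk.jl's reference tracking is exactly the open question in this thread.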
Sorry, I don't understand. I don't find |
I suppose I should have just said |
OK. It would be ideal if someone who understands memory management in Gtk (@jonathanBieler ?) fixes this here, and @vtjnash adds the requisite NEWS entry (JuliaLang/julia#39811 (comment)). If @jonathanBieler can't fix it I can tackle it, but I'd like to wait for @vtjnash's NEWS entry first since the enhanced clarity will save me time (which is in short supply). |
@timholy Sadly I don't understand much about memory management in Gtk.jl or Julia for that matter. |
I don't see an obvious reason it should behave different. Someone might want to run this under |
Sorry @jonathanBieler, I had you mixed up with |
Unfortunately, every time I've tried to use
|
(I know this does not change anything, but a big "thank you" for investigating, Tim.) |
https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_date/2021-02/11/GtkReactive.1.7.0-DEV-50742c7e42.log