Stuck on feature extraction in centOS [bug] #953

Closed
raphael2692 opened this issue Jun 17, 2020 · 12 comments
Labels
bug for actual bugs (unsure? use type:question) stale for issues that becomes stale (no solution)

Comments

@raphael2692

Hi all.

I've tried to run a reconstruction of a bunch of photos on a CentOS 8.1 system (kernel Linux sd-01 4.18.0-193.6.3.el8_2.x86_64).

I managed to run the reconstruction via meshroom_photogrammetry, but the process hangs on feature extraction when using meshroom_compute or the Meshroom GUI (default settings).

The features are correctly extracted (the feature extraction logs report "task done in x seconds"), but the process hangs forever with no error message. I managed to force the process to continue by changing the value of the key 'status' in the '0.status' file to 'SUCCESS'. However, I am interested in automating the process, so I can't manually change the file.

I have tried the same dataset on Windows and CentOS 7 with no issues, so it seems to be something specific to CentOS 8.
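For anyone who wants to script the manual workaround described above, here is a minimal sketch. It assumes the '0.status' file is plain JSON with a top-level 'status' key, which is not confirmed anywhere in this thread. The function name `mark_status_success` is made up for illustration and is not part of Meshroom. It only hides the symptom and does not fix the underlying hang.

```python
import json
from pathlib import Path


def mark_status_success(status_path):
    """Set the 'status' key of a node status file to 'SUCCESS'.

    Assumption: the status file is a JSON object (not confirmed in the thread).
    Returns the previous value of the key, or None if it was absent.
    """
    path = Path(status_path)
    data = json.loads(path.read_text())
    previous = data.get("status")
    data["status"] = "SUCCESS"

    # Write to a sibling temporary file first, then swap it into place,
    # so an interrupted write cannot leave a half-written status file.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(data, indent=4))
    tmp.replace(path)
    return previous
```

It would be called with the path to the node's '0.status' file once the log shows the task is done. All other keys in the file are left untouched.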

@raphael2692 raphael2692 added the bug for actual bugs (unsure? use type:question) label Jun 17, 2020
@fabiencastan
Member

Which version are you using?
Are you using a custom build or the latest release (2019.2)?
Could you test with the release or previous release?

@fabiencastan
Member

fabiencastan commented Jun 17, 2020

If you manually change the status, do you get stuck at another stage? Or is the problem only on FeatureExtraction?

@simogasp
Member

If you are using PopSift and a recent or self-built AliceVision develop branch, it could be related to alicevision/AliceVision#807 and alicevision/popsift#70.

@raphael2692
Author

Thanks for the response.

I am using the 2019.2 stable release with the binaries shipped with it.

If I manually change the status, the pipeline finishes with no further issues.

Should I test 2019.1?

@simogasp
Member

This is what was happening to me when we switched to Boost 1.70, and it led me to propose alicevision/popsift#70 to better manage the release of resources.

Does it hang even when you are using other types of features (e.g. CPU SIFT (check "Force CPU extraction"), AKAZE, etc.)?

@gryan315

I had this happen to me in multiple Linux distributions when unchecking "force CPU extraction." It is an odd bug, because it finishes processing the step but then hangs there. If I hit the stop button and then check "force CPU extraction" again, it instantly completes the FeatureExtraction node, because it was already computed.
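Since the step finishes its work and only then hangs, one possible stopgap for automated runs is a small watchdog around the process. The sketch below is generic and not part of Meshroom. It runs any command, watches its output for a completion marker (here the "Task done in" text from the logs quoted in this thread), and kills the process if it has not exited after a grace period. The name `run_with_hang_watchdog` and its parameters are made up for illustration, and the command line to pass in is left to the reader.

```python
import subprocess
import sys
import threading


def run_with_hang_watchdog(cmd, done_marker="Task done in", grace_seconds=30.0):
    """Run cmd; once done_marker appears in its output, allow grace_seconds
    for a clean exit, then kill the process if it is still alive.

    Returns (marker_seen, was_killed, return_code).
    Note: if the marker never appears, this waits for the process indefinitely.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    marker_seen = threading.Event()

    def pump_output():
        # Echo the child's output and note when the completion marker shows up.
        for line in proc.stdout:
            sys.stdout.write(line)
            if done_marker in line:
                marker_seen.set()

    reader = threading.Thread(target=pump_output, daemon=True)
    reader.start()

    was_killed = False
    while True:
        try:
            proc.wait(timeout=0.2)
            break  # the process exited on its own
        except subprocess.TimeoutExpired:
            if marker_seen.is_set():
                try:
                    proc.wait(timeout=grace_seconds)
                except subprocess.TimeoutExpired:
                    # Work is reported done but the process is stuck: kill it.
                    proc.kill()
                    proc.wait()
                    was_killed = True
                break

    reader.join(timeout=1.0)
    return marker_seen.is_set(), was_killed, proc.returncode
```

Killing the process this way does not by itself mark the node as finished, so the status change described earlier in the thread would still be needed. It also does nothing about the underlying resource-release problem.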

@raphael2692
Author

With "force_cpu" : true set on the Feature Extraction node, the process doesn't hang anymore. However, I really need to use the GPU for this task. Is there anything I can do?

@NexTechAR-Scott

Had same issue in Mint Cinnamon but not Debian 10.

@gryan315

gryan315 commented Jun 21, 2020

@simogasp @fabiencastan I'm not sure if this is the exact same issue, but I just re-tested this on a dev build and got nearly identical results. The initial problem in the binary release, which other Linux users also seem to be seeing, is: uncheck "force CPU extraction" > the GPU run reaches "task done" but doesn't end (no green bar, no red bar) > click the stop button (red bar now, because of the stop) > check "force CPU extraction" > task completes in 0.000 seconds with a green bar.

In the dev build, it's about the same, but with some extra verbosity and a red bar error instead of just hanging in limbo:

[13:28:49.240190][info] Task done in (s): 43.084000
/home/gary/AliceVision/build/popsift/src/popsift/s_image.cu:261
    Could not destroy texture object: driver shutting down

If I recheck "force CPU extraction" after this error, the task still completes in 0.000 seconds and works fine. Also, I'm not sure why it's using that path for popsift when I have it "installed" in /usr/local/include. I get the same results with either the build I built myself or the PKGBUILD from the Arch User Repository.
Edit: Is this a CUDA problem?

@simogasp
Member

Yes, it is a CUDA problem, and it is related to alicevision/AliceVision#807.

@simogasp
Member

Normally, if you try the latest develop branch, the problem should disappear.

@stale

stale bot commented Oct 20, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale for issues that becomes stale (no solution) label Oct 20, 2020

5 participants