Problems with some tumblr videos #103
These are inline videos and support for these was added just earlier today (see #102). Make sure you are using the latest git snapshot and try again.
None of the videos that can't be downloaded are playable when visiting the blog in a browser, either.
@mikf I just updated gallery-dl. The first blog has three different videos, but gallery-dl only downloads two. The one that is not downloaded (https://abzcasal.tumblr.com/post/177483231154) is skipped because I configured it not to include reblogs. The problem is that it is a reblog of a post that apparently no longer exists (deleted?). So, when gallery-dl downloads from Tumblr and finds a reblog of a post from the same blog, could it check whether the original post still exists before skipping the download? And as for the inaccessible videos, wouldn't it be better to suppress the error messages in that case?
Yeah, that issue sounds familiar. Typical for the great Tumblr video purge. Videos still appear online but only return a 403.
Tested, everything downloaded. Sometimes, if reblogs are not enabled, you will miss some posts.
Setting 'reblogs' to "deleted" will check if the parent post of a reblog has been deleted and download its media content if that is the case, otherwise it will be skipped. This is a rather costly operation (1 API request per reblogged post) and should therefore be used with care.
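The cost described above can be illustrated with a rough sketch. This is not gallery-dl's actual implementation: `select_posts`, `parent_exists`, and the post fields `reblogged` and `parent_id` are hypothetical names used only for illustration, and the existence check is passed in as a callable instead of issuing real API requests.

```python
def select_posts(posts, reblogs, parent_exists):
    """Return (selected posts, number of parent lookups performed).

    reblogs       -- True, False, or "deleted" (the option value discussed above)
    parent_exists -- hypothetical callable standing in for one API request
    """
    selected = []
    lookups = 0
    for post in posts:
        if not post.get("reblogged"):
            # Original posts are always kept.
            selected.append(post)
        elif reblogs is True:
            # All reblogs are kept; no lookup is needed.
            selected.append(post)
        elif reblogs == "deleted":
            # One lookup per reblogged post: this is the costly part.
            lookups += 1
            if not parent_exists(post["parent_id"]):
                selected.append(post)
        # reblogs is False: reblogs are skipped without any lookup.
    return selected, lookups


# Invented sample data: parent 10 still exists, parent 11 has been deleted.
posts = [
    {"id": 1, "reblogged": False},
    {"id": 2, "reblogged": True, "parent_id": 10},
    {"id": 3, "reblogged": True, "parent_id": 11},
]
existing_parents = {10}

selected, lookups = select_posts(
    posts, "deleted", lambda parent_id: parent_id in existing_parents
)
selected_ids = [post["id"] for post in selected]
```

In this sample, two of the three posts are reblogs, so the "deleted" mode performs two lookups, while `reblogs=False` would perform none.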
@KaMyKaSii I've extended the functionality of the
@mikf I have not tested the new version yet, but it means that reblogged media will be downloaded only if the original post is from the same blog and has been deleted, right?
It currently doesn't check if it is from the same blog, but otherwise it should work like you described.
No, because you need to check if the original post still exists or not.
No, you aren't, it's sadly not that simple. All in all there are 4 values you need:
@mikf Honestly, I don't want media from other blogs. Could you make it check whether the original post is from the same blog the user wants to download?
I think I've come up with another solution to your problem that doesn't need any extra API requests: gallery-dl -o reblogs=true --filter "not reblogged or reblogged_root_name == blog['name']" http://abzcasal.tumblr.com/ You enable reblogs (
And since duplicate image and video URLs get filtered out automatically, you would download any reblogged media only once, regardless of whether the original has been deleted or not. I also have a better implementation for
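To see how the filter expression from the command above selects posts, it can be evaluated by hand against sample metadata. The sketch below assumes the expression is an ordinary Python expression evaluated with each post's metadata fields as variables. The sample dictionaries and the `keep` helper are invented for illustration; only the field names `reblogged`, `reblogged_root_name`, and `blog['name']` come from the command itself.

```python
# The expression passed to --filter in the command above.
FILTER = "not reblogged or reblogged_root_name == blog['name']"
_code = compile(FILTER, "<filter>", "eval")


def keep(post):
    """Evaluate the filter with the post's metadata as local variables."""
    return bool(eval(_code, {"__builtins__": {}}, post))


# Invented sample metadata; real posts carry many more fields.
blog = {"name": "abzcasal"}
original = {"reblogged": False, "blog": blog}
own_reblog = {"reblogged": True, "reblogged_root_name": "abzcasal", "blog": blog}
foreign_reblog = {"reblogged": True, "reblogged_root_name": "another-blog", "blog": blog}

results = [keep(post) for post in (original, own_reblog, foreign_reblog)]
```

Because `or` short-circuits, `reblogged_root_name` is only looked up for posts that are reblogs, so an original post without that field does not raise an error in this sketch.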
@mikf I just ran the command and it worked perfectly, with either "reblogs": false or "reblogs": "deleted". Either way, this works great for me, so if you can do the better implementation, thank you!
- rename "deleted" to "same-blog"
- change test for deleted original post to test if original post owner has the same UUID (full blog name) as the one being downloaded from
- add 'blog[uuid]' metadata to allow comparison with 'reblogged_from_uuid'
So I changed things a bit and everything should now work as you wanted, I hope. The name changed from It now just checks if the original post is from the same owner as the blog being downloaded from. Checking whether the original post still exists isn't really necessary, as explained above, so it doesn't need any extra API requests either.
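The comparison described here can be sketched as follows. This is an illustration under stated assumptions, not the actual gallery-dl code: the `wanted` helper and the sample posts are hypothetical, while the metadata names `blog['uuid']` and `reblogged_from_uuid` and the "same-blog" value are taken from the commit message above.

```python
def wanted(post, reblogs):
    """Decide whether a post's media should be downloaded.

    reblogs -- True, False, or "same-blog"
    """
    if not post.get("reblogged"):
        # Original posts are always downloaded.
        return True
    if reblogs == "same-blog":
        # Keep a reblog only if the original post's owner is the blog
        # being downloaded from; no extra API request is involved.
        return post.get("reblogged_from_uuid") == post["blog"]["uuid"]
    return reblogs is True


# Invented sample posts; the UUID is assumed to be the full blog name.
blog = {"uuid": "abzcasal.tumblr.com"}
original = {"reblogged": False, "blog": blog}
own_reblog = {
    "reblogged": True,
    "reblogged_from_uuid": "abzcasal.tumblr.com",
    "blog": blog,
}
foreign_reblog = {
    "reblogged": True,
    "reblogged_from_uuid": "another-blog.tumblr.com",
    "blog": blog,
}

decisions = [wanted(post, "same-blog") for post in (original, own_reblog, foreign_reblog)]
```

Since both values are already part of each post's metadata in this sketch, the decision needs only a string comparison per post.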
@mikf But then I believe there is a problem with the system that skips duplicates. I just ran the command "gallery-dl -i tumblrs.txt" (to download my favorite tumblr blogs) and it downloaded duplicate content from various blogs. Maybe you can reproduce it by running "gallery-dl muyanna.tumblr.com" with "reblogs": false and then the same command with "reblogs": "same-blog".
You will get "duplicate" files if you run it twice with different For So either stick to one value for the
@mikf First, sorry for taking so long to respond. I still have one question: is duplicate content being downloaded with "reblogs": "same-blog" expected behavior, or something that should be fixed? It happens even after deleting a blog's folder and starting the download from scratch. I believe you can reproduce it with the http://muyanna.tumblr.com blog, since I've noticed it always happens there. Download it today with the same settings as mine, wait until the owner reblogs their own posts, and then download the blog again. You will see that the new reblogs are downloaded again, creating duplicate content.
That is expected behavior. gallery-dl can't really know that files from a new post are files it has downloaded before, but here are possible solutions you could try:
Some tumblr videos aren't detected (and consequently not downloaded), and some are detected but can't be downloaded.
Example blog where videos are not being detected: http://abzcasal.tumblr.com
Example blog where some videos can't be downloaded: http://delicinhadele.tumblr.com
Both contain adult content, but I don't know whether secure mode is enabled for these blogs (or whether it affects anything).
[Image: screenshot_20180906-150100]
[Image: screenshot_20180906-142945]
[Image: img-20180906-wa0022]
Attached are screenshots of the problem and my tumblr extraction setup.