Describe the bug
I noticed that TumblThree app scan times are much higher than expected for blogs with duplicates and decided to look into this.
TumblThree seems to send an HTTP request to ".media.tumblr.com/" for each duplicate it finds, which creates a large number of additional HTTP requests. The initial JSON response from "/api/read/json?debug=1&num=..." seems to contain a unique file reference ID that could be pulled from "regular-body". Using it would greatly reduce the number of requests needed to complete the scan and reduce the server load. You can replicate this by enabling "force rescan" and using any HTTP logger of your choice. This issue impacts rescans, reblogs, duplicates, etc., and I think a fix would be useful for a lot of users. Sadly, I don't have the coding background to fix this myself, which is why I am raising this issue.
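To illustrate the idea, here is a minimal Python sketch of checking media references from "regular-body" against a local index before any request is made. It is not TumblThree's code (TumblThree is written in C#), and it rests on assumptions: that the file name at the end of the media URL can serve as the unique reference, that the index is a plain set of such names, and that the helper names `media_urls`, `file_key`, and `urls_to_request` are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumption: embedded media shows up in "regular-body" as URLs on a
# *.media.tumblr.com host. The pattern is illustrative, not exhaustive.
MEDIA_URL = re.compile(r'https?://[\w.-]*media\.tumblr\.com/[^\s"\'<>]+')


def media_urls(post):
    """Return all media URLs embedded in a post's "regular-body" HTML."""
    return MEDIA_URL.findall(post.get("regular-body", ""))


def file_key(url):
    """Derive a lookup key from a media URL (assumption: the file name)."""
    return urlparse(url).path.rsplit("/", 1)[-1]


def urls_to_request(post, known_keys):
    """Keep only the URLs whose key is not already in the local index."""
    return [u for u in media_urls(post) if file_key(u) not in known_keys]
```

With a made-up post body that embeds two images and an index that already contains the first file name, `urls_to_request` would return only the second URL, so no request is issued for the known duplicate.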
To Reproduce
Steps to reproduce the behavior:
1. Set up HTTP monitoring or a debug trace for TumblThree.
2. Start TumblThree with the deduplication setting enabled and rescan an existing site that was already processed.
3. See the additional ".media.tumblr.com/" requests for files already in the index cache.
Expected behavior
Fast scan times that need only the JSON response when the content consists of duplicates.
Desktop (please complete the following information):
TumblThree version: v2.13
OS: Windows 10 Home
Browser: Chrome, version 125
Well, the missing information was that the already downloaded files were downloaded for another blog and not for the scanned one.
And the affected posts are those with embedded images, so the JSON structure isn't that helpful.
We'll change it to check not only the current blog but also all other blogs for duplicates in this case.
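As a rough illustration of that change, the Python sketch below looks a file up in the current blog's index first and then in every other blog's index. It is only a sketch under assumptions, not the actual TumblThree implementation: the `indexes` mapping (blog name to a set of file keys) and the function name `find_existing` are hypothetical.

```python
def find_existing(key, current_blog, indexes):
    """Return the name of a blog whose index already holds `key`, else None.

    `indexes` is a hypothetical mapping of blog name -> set of file keys.
    The current blog is checked first, then all other blogs.
    """
    if key in indexes.get(current_blog, set()):
        return current_blog
    for blog, keys in indexes.items():
        if blog != current_blog and key in keys:
            return blog
    return None
```

A non-`None` result means the file was already downloaded, possibly for another blog, so the extra request to the media host could be skipped.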