Plugin downloading images for matching #17
Comments
I've gone back through and I believed we'd left it at Metron-Project/metron#164 (reply in thread)
Apologies if I misunderstood or didn't catch something afterwards. I did a dirty PR for using the hash, which was rejected as larger work was required: comictagger/comictagger#491. How much investigation did you do? I'm interested in whether you've also gotten a lot more new accounts (as one aim was wider use of sources other than Comic Vine) or whether some "power" users are responsible for the vast majority of the usage. I'm surprised you don't have a Patreon or similar for Metron, btw. If you want the cover matching switched off just say the word; it will obviously require a new release, so it will depend on people updating. I did try to get GCD to add a cover hash as well, so maybe another bump there too: GrandComicsDatabase/gcd-django#615 |
No worries, we probably just got our wires crossed.
I'm planning to parse the server logs to get a breakdown of the different clients that access the server and get user counts, since I figured you'd be interested in seeing just how many users are using your plugin, though it is a pretty low-priority task so I'm not sure when that will happen. On average we get 2-3 new accounts per day and maybe 1 of them will be using ComicTagger. Fairly often, it seems, when they initially start using it they will re-tag their whole collection, which depending on the size can take several days, and then settle in and just tag new issues each week.
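A per-client breakdown like the one described here could be sketched as follows. This is only a rough illustration: it assumes the server writes the common "combined" access log format and that the authenticated username appears in the third field, neither of which is confirmed in this thread. The function name `summarize_clients` is made up for the example.

```python
import re
from collections import Counter, defaultdict

# Assumption: "combined" access log format, e.g.
# 203.0.113.5 - alice [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Client/1.0"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ (?P<user>\S+) \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def summarize_clients(lines):
    """Return {client: (request_count, distinct_user_count)} for the given log lines."""
    requests = Counter()
    users = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the assumed format
        # Use the product token before the first "/" as the client name.
        client = match.group("agent").split("/", 1)[0].strip() or "unknown"
        requests[client] += 1
        if match.group("user") != "-":
            users[client].add(match.group("user"))
    return {client: (requests[client], len(users[client])) for client in requests}
```

Feeding it the lines of an access log gives one `(requests, users)` pair per user-agent product name, which is enough to compare clients against each other.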
Setting up Open Collective for future funding is on the roadmap for 2024, but I think I need to get 100 stars on the repo before qualifying, so I'll need to encourage folks to do that since most of the time they will never visit the GitHub page.
I think it would be worthwhile to do that since, out of all the clients that access the API, ComicTagger is by far the most resource hungry. On average, CT makes 2-5 times the number of requests (depending on the series) to identify an issue compared to other clients, and CT is the only one that is downloading an image for every issue. I'd suggest using the
|
No worries, yeah, just a general interest in the numbers. Driving people to Metron to help support it is one of the goals, so it's nice to know if it's working.
Hmm... yeah, re-tagging everything, that's a bit of a pain. I guess they like your data better :)
Completely unsolicited suggestion: The reason I mention Patreon is because most people who are supporters of things will already have an account and removing any friction might help. Adding a goal as well, say that if Metron gets x dollars for x months, the limit rate will be reduced and the like. It looks like Open Collective is for normal users as well though? Going for the users of the data will be a wider audience for backers. Most devs will probably not be making commercial apps. Also, do let me know when it goes active and I'm happy to change the about message to include the link to the page etc.
Sure thing. Are you okay with loading the images in the GUI still as that requires people clicking and should be a much lower volume?
CT tries to keep requests to a minimum, so I wonder what accounts for the difference in the number of calls? (Rhetorical question, but I guess metrontagger is one? If you want to throw some app names my way, most welcome.)
That is the plan: use the hash and then fall back to downloading the image. Some larger structural changes are needed, but it's now more at the forefront because of this :) |
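The hash-first plan mentioned above might look roughly like this. It is a sketch under stated assumptions, not ComicTagger's or Metron's actual code: the dictionary keys `cover_hash` and `image` are hypothetical field names, and `download_and_hash` stands in for whatever routine fetches a cover and hashes it.

```python
from typing import Callable, Optional


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two hex-encoded perceptual hashes."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def cover_distance(
    local_hash: str,
    issue: dict,
    download_and_hash: Callable[[str], str],
) -> Optional[int]:
    """Compare against the published hash; download the cover only if none exists."""
    remote_hash = issue.get("cover_hash")  # hypothetical field name
    if not remote_hash:
        image_url = issue.get("image")  # hypothetical field name
        if not image_url:
            return None  # nothing to compare against
        # Fallback path: this is the only branch that costs server bandwidth.
        remote_hash = download_and_hash(image_url)
    return hamming_distance(local_hash, remote_hash)
```

The comparison is only meaningful if the local and published hashes come from the same hashing algorithm and bit length, so a real implementation would need to confirm that before trusting the distance.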
In my experience, most open source projects tend to use Open Collective (Komga, Kavita, Solus, Discord.py, journa.host, etc) over Patreon, since it has some benefits that Patreon doesn't offer like fiscal hosting. Could offer both as options, but I'm not sure it would be worth the extra work involved.
Not sure, since I've never used the GUI. When I get some free time I'll give it a look, so I understand how it's implemented.
Let's say you are searching a file, like
Using the same file with the ComicTagger CLI, the following requests are made to the API:
The primary reason CT makes almost 3x the number of API requests is because it's using the |
I would like, if at all possible, to put a message along the lines of: "The image matching feature has been removed due to the large amount of data it consumes on the Metron servers. If you are able to help with the server costs, please consider donating here (link)."
It shows a list of issues, and when a user clicks on a different issue it loads the data (and fetches the images).
Thanks for that. I will look into making changes to do the same. |
I don't see any problem with that.
Did some testing of the GUI today. One thing I noticed: on the AutoTag window it appears to load the cover from the user's comic on the left side of the window, and on the right side it loads an image downloaded from Metron, which seems to be a waste of time/space since it immediately clears it when it moves on to the next issue to be identified. Another thing I noticed is that ComicTagger was only able to identify about 75% of the issues that Metron-Tagger was able to. I'm not sure why there was such a discrepancy (and I don't really feel like digging thru the various logs to find out), but here's a list of issues that failed if you're interested. Results of testing 149 issues:
By the way, where is the screenshot you added from? In the 30 minutes or so I spent testing CT, I never came across it. Oh, and it might be worthwhile to convert this to a discussion. |
Only there is no link to send them to :) Maybe I can say something along the lines of "Donations accepted soon, check the website."?
As it has to download the image to generate the hash anyway, I think it's just a way to see that something is happening, iirc that same window is used for low confidence matches too for cover comparisons.
Thanks for the info, something to look into. It may be related to different covers etc.
That's via
Don't have them on atm. |
That would be fine, tho I don't have an ETA when I'll have the donation option available.
Did a quick glance at the files that couldn't be matched and my guess is Btw, if you need to know what Metron is receiving for your requests while working on this, just give me a shout and I can look at the server logs, tho it might be best to contact me on Matrix (if you use it) for a quick response. |
I have a PR ready to go #18 and I've changed the about text to:
Thanks for having a look. Sanitation can be a messy business (sorry, couldn't resist), if it helps or hinders.
I think the problem is the Were you okay with the covers being used in the issue list? (The user has to manually click on each issue after the first.) |
That looks fine to me.
Haven't had a chance to look at it yet, but I guess my questions would be:
|
It's a manual process, so I would imagine people would only use it now and then, but no metrics are sent from CT to know for sure. Unless someone has impressive actions per minute, I would imagine it's low.
This would be for comparison reasons, I would guess. CT was created a long time ago so I can't say for sure. It was also probably to show the same information as the website. It can always be taken out at a later date if its use proves to be a problem. What do other people (if they do) use it for? I know I asked for it to be included, but maybe remove the link from the API until the donation system is in place and covers the expenses? |
As far as I'm aware you're the only in the I only asked why it was needed, since in most of the projects I've been involved with, we tended to write fairly detailed user stories on the workflow, information displayed, etc. needed before implementing it, and I still tend to think that way when looking at UIs. Anyway, if it's a low-impact window I don't see an issue, but I guess I'd say give it a look to see if it's really needed or not. BTW, one other thing I thought of was it would make sense to change the user-agent on the plugin to something like |
Empty string is an option.
I don't know, but I'm pretty confident that didn't happen :)
Visuals are always appealing. I do wonder if it should also show the local cover too...
Talkers don't have version numbers because they were built with CT (that might need to change now plugins can be loaded from a dir). There will need to be a new version of CT for Metron talker to be replaced (1.6.0a11 as it stands). |
That manual issue identification window is showing search results. The user can use the downloaded image to verify (through manual visual inspection) that the result matches the local comic's cover. If it's the wrong series (or volume) the user can back out and select another series. Also, back in the day it was not uncommon to have issue numbers in the filenames be wrong, so easily searching through issues in the series for the correct issue (and cover) was handy. The original UI for CT was modeled on the Comic Vine Scraper add-on for ComicRack and predates the automatic visual hash matching. |
I guess that's why I'm asking if it's necessary; showing the image while also checking the hash seems a bit like using a belt while wearing suspenders. Truthfully, I think it would be worthwhile to look at the GUI as a whole, since there seem to be a lot of ways to improve it, performance-wise and also on the usability front. |
Well, you're publishing to PyPI, so I mean they do have a version number, you're just not using it in the build tools. Truthfully, a version number for the plugin would be way more useful than knowing the CT version on my end. |
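One way a plugin published as a package could report its own version is to read it from the installed package metadata at runtime. The sketch below uses the standard library's `importlib.metadata`; the function name, the package and host names passed in, and the `name/version` layout of the string are all assumptions for illustration, not the format proposed in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def build_user_agent(plugin_package: str, host_name: str, host_version: str) -> str:
    """Build a User-Agent string that carries both the plugin and host versions."""
    try:
        # Reads the version recorded when the package was installed.
        plugin_version = version(plugin_package)
    except PackageNotFoundError:
        # E.g. running from a source checkout that was never installed.
        plugin_version = "unknown"
    return f"{plugin_package}/{plugin_version} {host_name}/{host_version}"
```

With something along these lines, the server logs would show which plugin release made each request, independent of the host application's version.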
It's not needed while checking the hash, but that's a different window that runs during the GUI auto-tagging process. That was definitely more of a "watch the computer at work" bell-and-whistle, where the local cover is on the left, and the potential matches are cycled through on the right as the auto-tagging is at work. Since CV allowed requests for thumbnails, it was relatively cheap to use. But yes, if it's a burden on the provider, the right-side cover display should be disabled in that specific window. But the auto-tagging process, while very effective, is not bullet-proof, and can fail to match an issue for any number of reasons (poorly named file, a cover image that doesn't show up in the right order in the archive, etc). When the automatic process fails, the manual process becomes necessary. That involves one of the windows like @mizaki shared above. In this case, there is no hash matching going on, just human eyeballs at work. I'd guess that the cover downloads for the manual issue identification window are not much more frequent than if someone was trying to browse to the website to verify that a cover is correct. So, yes, belt-and-suspenders in general for the application, as ComicTagger was intended to be a tool for doing both automatic and manual tagging. |
In that scenario, I can definitely see downloading the cover since I'm guessing it's a fairly rare occurrence. Also, it might be worthwhile, sometime down the line, for Metron to offer thumbnail images. |
I received January's bill for Metron, and noticed the site's bandwidth usage has jumped fairly significantly, which coincides with the increased usage of ComicTagger.
I did some investigating of the server logs, then tested CT locally, and noticed you are downloading the cover to assist in matching. I thought we had discussed not enabling that since, unlike CV, the server expenses are currently covered by me and not a corporation. That was part of the reason the cover hash info was added to the Metron API: to avoid unnecessary bandwidth usage for something that really shouldn't be needed.
I'm fine covering any additional cost for a while, but if you want to continue using that feature we should probably talk about how to offset the additional cost.