LaunchBox Community Forums

Scraper, The


garbanzo

I'm setting up a fresh config after a drive failure (ugh), and the scraper is frustrating me.

During a bulk import, any games that end in ", The", for example "Simpsons, The", don't get scraped.

But going back and scraping those same games individually grabs all info and media.

Why the extra step? Does the bulk import not use fuzzy matching? Can an option be added to ignore a leading "The " (or trailing ", The") when scraping to avoid all this extra work?
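To make the request concrete, here is a minimal sketch of the kind of title normalization being asked for. It is not how LaunchBox works internally; the function name `move_trailing_article` and the regular expression are hypothetical, and the sketch assumes No-Intro style names where the article follows a comma.

```python
import re

# Hypothetical pattern: "<body>, <article>" optionally followed by a
# " - subtitle" or a "(Region)" / "[flag]" suffix, as in No-Intro names.
_TRAILING_ARTICLE = re.compile(
    r"^(?P<body>.+?), (?P<article>The|An|A)(?P<rest>\s+-\s+.*|\s*[(\[].*)?$",
    re.IGNORECASE,
)


def move_trailing_article(title: str) -> str:
    """Rewrite 'Simpsons, The' as 'The Simpsons'; leave other titles alone."""
    cleaned = title.strip()
    match = _TRAILING_ARTICLE.match(cleaned)
    if match is None:
        return cleaned
    rest = match.group("rest") or ""  # the suffix group is optional
    return f"{match.group('article')} {match.group('body')}{rest}"


if __name__ == "__main__":
    for name in ("Simpsons, The", "Legend of Zelda, The (USA)", "Tetris"):
        print(name, "->", move_trailing_article(name))
```

Running both the file name and the database title through a step like this before comparing them would let a strict comparison still succeed on ", The" titles.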

 

Another nitpick - when fetching media, I ALWAYS choose the same three items - box front, clear logo, and HD video. And yet I ALWAYS have to uncheck all and manually check these three items when scraping individual games. It would be nice if this preference could be saved, like it is for bulk imports, so I don't have to remind LB what media I want over and over and over while cleaning up all the games that were missed by Scraper, The.

 

End of rant. I'm mostly frustrated that I have to set everything up all over again and little things like this really slow the process down.

The importer does NOT use fuzzy matching, or if it does, it is very minimal. If it were any less strict, you would get a ton of incorrect matches, which is obviously not good when you are importing a platform that could have thousands of ROMs. When you search for metadata from the edit game screen, that search is fuzzier.

So yes this is by design.
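The trade-off described above can be illustrated with a small sketch. This is not LaunchBox's actual matching code; `strict_match`, `fuzzy_match`, and the 0.6 cutoff are assumptions made for illustration, built on Python's standard `difflib` module.

```python
import difflib
from typing import Optional, Sequence


def strict_match(name: str, candidates: Sequence[str]) -> Optional[str]:
    """Return a candidate only if it equals the name, ignoring case."""
    wanted = name.casefold()
    for candidate in candidates:
        if candidate.casefold() == wanted:
            return candidate
    return None


def fuzzy_match(name: str, candidates: Sequence[str],
                cutoff: float = 0.6) -> Optional[str]:
    """Return the most similar candidate at or above the cutoff, else None."""
    by_key = {candidate.casefold(): candidate for candidate in candidates}
    hits = difflib.get_close_matches(
        name.casefold(), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[hits[0]] if hits else None


if __name__ == "__main__":
    database = ["The Simpsons", "Sonic the Hedgehog", "Sonic the Hedgehog 2"]
    # Strict matching misses the reordered title; fuzzy matching finds it.
    print(strict_match("Simpsons, The", database))
    print(fuzzy_match("Simpsons, The", database))
    # Fuzzy matching also "finds" a game that is not in the database at all.
    print(fuzzy_match("Sonic the Hedgehog 3", database))
```

The last call shows the risk: a sequel that is missing from the database still gets matched to a near-identical title, which is harmless when you review one game by hand but adds up across thousands of ROMs in a bulk import.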

As to your specific example, those games imported just fine for me.


Thanks for the explanation. It makes sense in theory - but in your experience, how often does the fuzzy matching of the individual game scraper actually lead to errors?

This morning I individually scraped at least 50 games that the bulk import missed, and so far it hasn't come up with a single incorrect match. Based on this experience, it's hard for me to understand how some fuzzy matching in the bulk scraper would be problematic. I admit, however, that my experience is limited - I'm adding popular Sega and Nintendo systems, not obscure Japanese PC games.

Anyway, I'm surprised the scraper isn't trained to work perfectly with No-Intro naming conventions - I'm guessing that's a common denominator among many LaunchBox users...
