LaunchBox Community Forums

Any chance of adding external scraper xml?


locvez


Hi, I currently use EmulationStation and LaunchBox, depending on my mood. Is it possible to integrate the following scrape tool, or allow importing its XML? https://github.com/sselph/scraper This program allows the game to be listed with its actual file name (text in parentheses is ignored in LaunchBox). I like knowing if my game is (Europe) or (Japan) etc., so having this as an option would be cool. Thanks

XML scraping has been brought up in the past, but the chance that this specific program will get integrated is potentially very low. Jason will have his own standards for how he wants to handle it if he has time. Once our own database is live, anything else would really be very niche and only satisfy a few people. It hasn't really been widely requested at this point. I'm not saying it won't happen, but the system we have now works well. Ignoring stuff between the ( ) and [ ] is intentional; otherwise scraping can break, and the more logic that gets added in, like say "Ignore parentheses, import, now add back in Disc or Disk + numbers and region if applicable", the more the import process will slow down. A file hash can also be wrong. It's different depending on who ripped it and how, if it's patched, if translated, etc.
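To illustrate the bracket handling described above, here is a minimal Python sketch. It is not LaunchBox's import code: the function name `split_title_and_tags` and the regular expression are assumptions made for this example, and it assumes the file name ends in an extension. It strips the ( ) and [ ] groups to get a search title, and keeps them in a separate list so the region or disc number is not lost.

```python
import re

# Matches one "(...)" or "[...]" group plus any whitespace before it.
_TAG = re.compile(r"\s*(\([^)]*\)|\[[^\]]*\])")


def split_title_and_tags(file_name):
    """Return (search_title, tags) for a ROM file name (hypothetical helper)."""
    # Drop the extension; assumes the name actually has one.
    stem = file_name.rsplit(".", 1)[0] if "." in file_name else file_name
    # Keep the text inside each bracket pair, without the brackets themselves.
    tags = [match.group(1)[1:-1] for match in _TAG.finditer(stem)]
    # The search title is whatever remains once the groups are removed.
    title = _TAG.sub("", stem).strip()
    return title, tags


title, tags = split_title_and_tags("Sonic The Hedgehog (USA, Europe).zip")
print(title)  # Sonic The Hedgehog
print(tags)   # ['USA, Europe']
```

Keeping the tags next to the cleaned title means a frontend could search on the title alone and still show the region afterwards, as an optional display setting.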

Yeah, fair point, but all the ROMs I have in EmulationStation have artwork and proper names. More than half of my LaunchBox list has missing artwork, and all games from different regions are named identically. I supported LB before even trying the free version purely because the BigBox mode looked so promising... I get that you want it 'your way', but if you want more paying people to take an interest in the product, you have to listen to what your current paying customers are suggesting to make the product better. The hashed values are created by the community for every known ROM for the supported systems, so it is highly compatible with No-Intro / GoodROM / etc. sets. All the things I'm suggesting can be added as an option rather than a default setting.

Who said Jason doesn't listen? All he does is listen, and he has to wade through a lot of crap. Just because someone thinks they know what's good for something doesn't make it right. I'm also not aiming those last two lines at you or anyone else specifically on the forums; you should just see some of the flaming e-mails he gets. I think he actually posted one of them on here. xD Again, I never said XML scraping wouldn't be a thing; just remember Jason is alone when it comes to creating this software, so things can sometimes take a while. There is a giant list of things to add and problems to fix. That was all. Also, Jason loves the support he does get. I wouldn't donate a majority of my day to answering forum posts if I didn't either. The GamesDB isn't perfect, but it is slowly getting better over time. I had to upload several entries myself. That is why we want our own database, however, and when it's up I'll be able to upload all the images I have set, and so will the others. It will be fairly complete. A file's hash can be wrong just like the spelling of a name could be wrong too. Granted, not as easily wrong, but still possible. It all comes with trade-offs. Edit: Oh and yeah, I assumed that it would be optional. Never a question there.

Hi @locvez, thanks for pointing that out to me; I hadn't seen Emulation Station's solution there. If nothing else, we could use some of the ideas to help improve the search. I'm pleased to see it's the MIT license though, so we could freely integrate it as-is, though I've never seen that "Go" language before.

Thanks for taking the time to look, Jason. I'm not sure how difficult it would be to make LB compatible with the XML produced by the sselph scraper (which is a standalone scraper). The dev has collected vast amounts of ROM file hashes from users and cross-referenced them with TheGamesDB. It also allows the user to have the file name as the game name in the XML data, which is what EmulationStation displays. I just feel that this could be a small, optional (premium?) change that will help LB appeal to more users. I love LB but use ES more, purely because it displays more box art :( I don't like ES because the built-in scraper sucks and the UI is nowhere near as good as LB's.

When I get home tonight I will try and remember to upload samples of the xml produced by the scraper and you can see what you think. The biggest benefit I think is that it actually contains a "TheGameDB" game ID as well as downloading the artwork and cross referencing the location of the artwork for the frontend.

OK, the XML for the Mega Drive game Sonic The Hedgehog is as follows:
<game id="114" source="theGamesDB.net">
  <path>./S/Sonic The Hedgehog (USA, Europe).zip</path>
  <name>Sonic The Hedgehog (USA, Europe)</name>
  <desc>When Dr. Eggman was hatching his plans for a global takeover, there was one little thing he didn't count on - Sonic The Hedgehog! Our blue hero zips, flips, and spins through the levels at lightning speed to collect the Chaos Emerald and restore World Order.</desc>
  <image>./images/Sonic The Hedgehog (USA, Europe)-image.jpg</image>
  <rating>0.72222</rating>
  <releasedate>19910623T000000</releasedate>
  <developer>Sonic Team</developer>
  <publisher>Sega</publisher>
  <genre>Action</genre>
  <players>1</players>
</game>
The image associated with this ROM file is attached: Sonic-The-Hedgehog-USA-Europe-image.jpg

If you wish, I can upload the entire Mega Drive XML, but it all follows the same idea. The scraper.exe file is executed from H:/roms/sega-megadrive/ The ROMs are in H:/roms/sega-megadrive/A/(all games beginning with A), H:/roms/sega-megadrive/B/(all games beginning with B), etc.
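For anyone wanting to read entries like the one above, here is a rough Python sketch using only the standard library. The helper name `read_game` and the dictionary keys are assumptions made for this example, not part of the scraper or of LaunchBox; the sample is a shortened copy of the entry posted above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Shortened copy of the entry posted above.
SAMPLE = """<game id="114" source="theGamesDB.net">
  <path>./S/Sonic The Hedgehog (USA, Europe).zip</path>
  <name>Sonic The Hedgehog (USA, Europe)</name>
  <image>./images/Sonic The Hedgehog (USA, Europe)-image.jpg</image>
  <releasedate>19910623T000000</releasedate>
  <developer>Sonic Team</developer>
</game>"""


def read_game(element):
    """Turn one <game> element into a plain dict (hypothetical helper)."""
    def text(tag):
        child = element.find(tag)
        return child.text if child is not None else None

    released = text("releasedate")
    return {
        "gamesdb_id": element.get("id"),   # the id attribute on <game>
        "source": element.get("source"),
        "name": text("name"),              # keeps the (USA, Europe) part
        "path": text("path"),
        "image": text("image"),
        # The sample stores dates as YYYYMMDD, a literal T, then HHMMSS.
        "release_date": (
            datetime.strptime(released, "%Y%m%dT%H%M%S").date()
            if released else None
        ),
    }


game = read_game(ET.fromstring(SAMPLE))
print(game["gamesdb_id"], game["name"], game["release_date"])
```

For a whole file, the same helper could be applied to each element returned by `root.iter("game")` after parsing the file with `ET.parse(...)`.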

locvez, thanks for bringing up the subject of scraper XML files. These certainly do a better job of matching games to TheGamesDB than matching on name alone. While you are waiting for Jason to potentially add this functionality, I have built the ability to update your existing LaunchBox ROMs with the information pulled from the scraper XML file into my LaunchBoxAnnotator program. I have tested it on my own collection and have been pleasantly surprised to see it fill in some of my metadata gaps. Also, I have noticed that once you use my tool to associate TheGamesDB ID with the LaunchBox games, LaunchBox's metadata downloader can get additional art for the games that isn't pulled by the scraper. Hope this helps you!

Hi, your app looks really nice; it is just too bad I don't have a copy of Windows to test it out. If you have any questions about my script, feel free to email me. Go is a language created by Google and is fairly new, but it should be easy to read. It is possible to compile it to a C/C++ library to link in, but I don't have much experience doing that. I also wrote a version in C++ for ES (before I knew I could potentially compile to C++) here. The GamesDBShaScraper and ROMHasher are the portions that you'd be interested in, but it is lagging behind since Aloshi hasn't updated ES in some time. If all you need to do is get the game ID for each file and you have the mechanisms to get the data yourself, it wouldn't be too hard to build a script from my libraries to push the ID into your DB, assuming it is some standard DB type (sqlite, leveldb, file).
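The "get the game ID for each file" idea can be sketched in a few lines of Python. This is only an illustration, not the scraper's code: SHA-1 is an assumption here, the `hash_to_id` table is a hypothetical stand-in for whatever hash database is actually used, and any per-system preprocessing a real hasher might do before hashing is left out.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large ROMs are not loaded into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup_game_id(path, hash_to_id):
    """Return the game ID for a ROM file, or None if its hash is unknown.

    hash_to_id is a hypothetical dict mapping hex digests to game IDs.
    """
    return hash_to_id.get(sha1_of_file(path))
```

A file that has been patched or translated would hash differently and simply return None here, so a caller would still want a name-based search as a fallback.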

Thank you @sselph! Really great to have you here. Great to hear about the C++ versions as well. @mathflair has already built an external tool that I believe uses the scraper, and just hooks into LaunchBox via XML. Here's that thread: https://www.launchbox-app.com/forum/features/launchboxannotator-tool#p6576 I believe there have been other related threads as well, but that's the new official one. I do eventually plan on building scraper hashing functionality directly into LaunchBox, so I really appreciate you chiming in. :)

sselph said: "Your app looks really nice it is just too bad I don't have a copy of windows to test it out."
Hi Steve, thanks so much for your scraper; it is a super useful tool, especially for integrating into EmulationStation, and hopefully now we can make LaunchBox even better! Thanks for implementing a small change I suggested as well; it was really appreciated! I have an MSDN account with some serial numbers for Windows 7, 8.1 and 10, if you would like one. PM me if you do and I'll get a serial sorted out for you.
