I have done my own testing of LaunchBox's behavior. I want to make a couple of corrections and additions to Zombeaver's original post.
Here are the four ways to match a media file to a game:
1. Match game title to the media filename.
The file extension is ignored
Upper case vs. lower case is ignored
In the filename, all illegal filename characters from the title (as well as ' single quote) must be replaced by _ (underscore). A title of "Ultima III: Exodus" matches a filename of "Ultima III_ Exodus.jpg".
If there is a run of multiple consecutive illegal characters in the title, they must be replaced by a single underscore in the filename. So a title of "Game/:?" matches a filename of "Game_.jpg".
The full list of illegal characters to be replaced by underscore is < > : " / \ | ? * '
A hyphen followed by a number can be appended to the end of the filename. Any number of digits works. So a title of "Doom" matches "Doom-1.jpg", "Doom-11.jpg", "Doom-111.jpg", etc.
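Here is a rough Python sketch of rule 1 as I have described it. It only models the behavior I observed and is not LaunchBox's actual code. The function names (sanitize_title, matches_by_title) are made up for illustration.

```python
import re
from pathlib import PurePath

# Runs of these characters in the title collapse to a single underscore.
ILLEGAL_RUN = re.compile(r"""[<>:"/\\|?*']+""")
# Optional "-<digits>" suffix at the end of the media filename.
NUMBER_SUFFIX = re.compile(r"-\d+$")


def sanitize_title(title):
    """Turn a game title into the filename stem rule 1 expects."""
    return ILLEGAL_RUN.sub("_", title)


def matches_by_title(title, media_filename):
    """Hypothetical model of rule 1: title vs. media filename."""
    stem = PurePath(media_filename).stem.casefold()  # extension ignored
    expected = sanitize_title(title).casefold()      # case ignored
    return stem == expected or NUMBER_SUFFIX.sub("", stem) == expected


print(matches_by_title("Ultima III: Exodus", "Ultima III_ Exodus.jpg"))
print(matches_by_title("Game/:?", "Game_.jpg"))
print(matches_by_title("Doom", "Doom-111.jpg"))
```

All three calls use the examples from the rules above and should report a match.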
2. Match ROM/application filename to the media filename.
Upper case vs. lower case is ignored
Otherwise, the filenames must match exactly, except for:
A hyphen followed by a number can be appended to the end of the filename. Any number of digits works. So a ROM file of "Doom.zip" matches "Doom-1.jpg", "Doom-11.jpg", "Doom-111.jpg", etc.
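Rule 2 can be sketched the same way. This is a guess at the logic based on my testing, not LaunchBox's real implementation. It assumes the extension is dropped on both the ROM file and the media file (which is what the "Doom.zip" example suggests), and the name matches_by_rom_name is hypothetical.

```python
import re
from pathlib import PurePath


def matches_by_rom_name(rom_filename, media_filename):
    """Hypothetical model of rule 2: ROM filename vs. media filename."""
    # Assumption: extensions are ignored on both sides, case is ignored.
    rom_stem = PurePath(rom_filename).stem.casefold()
    media_stem = PurePath(media_filename).stem.casefold()
    # The media filename may carry an optional "-<digits>" suffix.
    return media_stem == rom_stem or re.sub(r"-\d+$", "", media_stem) == rom_stem


print(matches_by_rom_name("Doom.zip", "Doom-1.jpg"))
print(matches_by_rom_name("Doom.zip", "Doom_.jpg"))
```

Note that no character replacement happens here: the first call should match, the second should not.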
3. Match stripped game title to stripped media filename.
NOTE: In Zombeaver's post, he said that the stripped ROM filename can match the stripped media filename. This is not true. It is the stripped title, not the ROM filename, that works this way.
Upper case vs. lower case is ignored
In this matching mode, all of the following are ignored (i.e. removed from the strings before the string comparison is done):
Whitespace is ignored
These special characters are ignored: ! - . & ' , : " / \ ?
Matched parentheses/brackets/braces and their contents are ignored. (), [], {}, (a), [b], {c} will all be ignored. However, single unmatched parentheses/brackets/braces will not be ignored.
The words "a", "an", "and", and "the" are ignored. However, LaunchBox's behavior with these is weird. If there are two consecutive ands ("and and") or thes ("the the"), etc., it doesn't ignore them. I don't fully understand LaunchBox's logic here. Also, there may be other "magic words" that get ignored, but I haven't found any.
In this mode, illegal characters in the title are not replaced by underscore. This has the consequence that titles containing any of the illegal characters < > | * cannot use this matching mode, because those characters are not stripped from the title and cannot appear in a filename.
However, underscores in the title must have a corresponding underscore in the media filename.
Unlike matching modes 1 and 2, the hyphen followed by a number at the end of the filename is not allowed.
Example: A title of "My Strangely_(named) ??Game and, the" will match these filenames: "MyStrangely_Game.jpg", "My Strangely_(named) Game and the.jpg", "My Strangely_ Game (USA) [b] and.jpg", "My Strangely_Game &the.jpg", "TheMyStrangely_GameAnd.jpg", etc.
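Below is a simplified Python sketch of the stripping in rule 3. It is an approximation built from the rules above, not LaunchBox's actual code, and the names (strip_for_matching, matches_stripped) are hypothetical. In particular, it treats "a", "an", "and", and "the" as whole words only, so it does not reproduce the "TheMyStrangely_GameAnd.jpg" example or the odd behavior with repeated words. The order of the stripping steps is also my assumption.

```python
import re
from pathlib import PurePath

# Matched (), [], {} groups and their contents (innermost only, one pass).
BRACKETED = re.compile(r"\([^()]*\)|\[[^\[\]]*\]|\{[^{}]*\}")
# The ignored special characters: ! - . & ' , : " / \ ?
SPECIALS = re.compile(r"""[!\-.&',:"/\\?]""")
# Assumption: ignored words are matched as whole words only.
FILLER_WORDS = re.compile(r"\b(?:a|an|and|the)\b")


def strip_for_matching(text):
    """Reduce a title or filename stem to its stripped form."""
    text = text.casefold()               # case is ignored
    text = BRACKETED.sub(" ", text)      # drop (..), [..], {..}
    text = FILLER_WORDS.sub(" ", text)   # drop a / an / and / the
    text = SPECIALS.sub("", text)        # drop ignored punctuation
    return re.sub(r"\s+", "", text)      # drop all whitespace


def matches_stripped(title, media_filename):
    """Hypothetical model of rule 3: stripped title vs. stripped filename."""
    stem = PurePath(media_filename).stem
    return strip_for_matching(title) == strip_for_matching(stem)


title = "My Strangely_(named) ??Game and, the"
print(strip_for_matching(title))
print(matches_stripped(title, "My Strangely_ Game (USA) [b] and.jpg"))
print(matches_stripped("Doom", "Doom-1.jpg"))
```

Underscores are left alone, so they have to be present on both sides, and a "-1" suffix breaks the match because the digit survives the stripping.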
4. Match game ID to media filename.
If the media filename contains the game ID (anywhere; it can be preceded and followed by other stuff in the filename), it matches the game.
This is the long local GUID that looks like "46f2a757-d3f3-4540-939f-02f978bbb76e", not the LaunchBox database ID.
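Rule 4 amounts to a substring check. Here is a minimal Python sketch of it, with a hypothetical function name. I did not test whether LaunchBox cares about upper vs. lower case in the GUID, so the case-insensitive comparison here is an assumption.

```python
def matches_by_game_id(game_id, media_filename):
    """Hypothetical model of rule 4: game GUID anywhere in the filename."""
    # Assumption: the comparison ignores case.
    # An empty ID is rejected so it does not match every file.
    return bool(game_id) and game_id.casefold() in media_filename.casefold()


game_id = "46f2a757-d3f3-4540-939f-02f978bbb76e"
print(matches_by_game_id(game_id, "box_" + game_id + "_front.png"))
print(matches_by_game_id(game_id, "Doom.jpg"))
```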
This is all I have found in my testing. There are probably some more quirks in the third matching mode, but these rules should cover the vast majority of cases.