I'm planning on making a tutorial series at some point, but the basic strategy I used to make the LODs for today's test was as follows:
For LOD B, select the whole model in Blender and use a combination of 'Tris to Quads', 'Un-Subdivide', 'Merge by Distance' and 'Degenerate Dissolve'. This typically ends up at about 40% of the triangle count of LOD A, with the same materials.
For LOD C, divide everything visible on the exterior of the car into three categories: paint, glass or black. Overwrite the materials on all the exterior objects with one of those three options, choosing the best fit from the existing materials (body_paint/ext_glass/black_plastic, for example). Then join the objects that share each material, clear the parent (while keeping the transformation) and delete all other items. You end up with an external shell of only 3 objects, which you can then reduce further in triangle count using the LOD B methods.
Of course, all of that probably looks like a foreign language if you're not familiar with Blender. It would be relatively straightforward to adapt the method to add a third LOD for particularly unoptimised cars.
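For anyone who would rather script it, the LOD B and LOD C steps above could be sketched with Blender's Python API roughly like this. It's only a sketch, not the exact process I clicked through: the function names, the object-to-category dictionary, and the threshold and iteration values are all made up for illustration, and it assumes materials named body_paint, ext_glass and black_plastic already exist in the file.

```python
from collections import defaultdict

# The three LOD C categories, mapped to the example material names.
# Assumption: materials with these names already exist in the .blend file.
CATEGORY_MATERIALS = {
    "paint": "body_paint",
    "glass": "ext_glass",
    "black": "black_plastic",
}


def group_by_category(assignments):
    """Turn {object_name: category} into {category: [object_names]}."""
    groups = defaultdict(list)
    for obj_name, category in assignments.items():
        if category not in CATEGORY_MATERIALS:
            raise ValueError(f"unknown category {category!r} for {obj_name!r}")
        groups[category].append(obj_name)
    return dict(groups)


def reduce_selected(unsubdivide_iterations=1, merge_distance=0.001):
    """LOD B clean-up on the selected meshes (one of them must be active).

    The iteration count and distance are illustrative guesses; tune per car.
    """
    import bpy  # only importable inside Blender

    bpy.ops.object.mode_set(mode='EDIT')
    bpy.ops.mesh.select_all(action='SELECT')
    bpy.ops.mesh.tris_convert_to_quads()                          # Tris to Quads
    bpy.ops.mesh.select_all(action='SELECT')
    bpy.ops.mesh.unsubdivide(iterations=unsubdivide_iterations)   # Un-Subdivide
    bpy.ops.mesh.select_all(action='SELECT')
    bpy.ops.mesh.remove_doubles(threshold=merge_distance)         # Merge by Distance
    bpy.ops.mesh.dissolve_degenerate(threshold=merge_distance)    # Degenerate Dissolve
    bpy.ops.object.mode_set(mode='OBJECT')


def build_shell(assignments):
    """LOD C: one joined object per category, everything else deleted.

    `assignments` is the hand-made {object_name: category} mapping for the
    exterior objects. Run this on a copy of the scene, since it removes
    every other object (including lights and cameras).
    """
    import bpy  # only importable inside Blender

    keep = []
    for category, names in group_by_category(assignments).items():
        material = bpy.data.materials[CATEGORY_MATERIALS[category]]
        objs = [bpy.data.objects[name] for name in names]
        bpy.ops.object.select_all(action='DESELECT')
        for obj in objs:
            obj.data.materials.clear()            # overwrite existing materials
            obj.data.materials.append(material)
            obj.select_set(True)
        bpy.context.view_layer.objects.active = objs[0]
        bpy.ops.object.join()
        bpy.ops.object.parent_clear(type='CLEAR_KEEP_TRANSFORM')
        keep.append(bpy.context.view_layer.objects.active)

    for obj in list(bpy.data.objects):
        if obj not in keep:
            bpy.data.objects.remove(obj, do_unlink=True)
```

Inside Blender you would call something like `build_shell({"door_L": "paint", "windscreen": "glass", "grille": "black"})` (object names invented here), then select the three resulting objects and run `reduce_selected()`. Sorting the objects into categories is still a manual judgement call; the script only does the repetitive part.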
Results from left to right: LOD A = 191k tris & 105 objects. LOD B = 75k tris & 105 objects. LOD C = 18k tris & 3 objects.
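As a quick sanity check on those figures, here is a small Python snippet that works out each LOD as a percentage of LOD A. The counts are the rounded numbers from the results line, and the names `LODS` and `share_of_base` are just made up for the example. LOD B comes out at roughly 39% of LOD A's triangles, in line with the "about 40%" rule of thumb, and LOD C at roughly 9%.

```python
# (triangles, objects) per LOD, using the rounded figures from the results.
LODS = {
    "LOD A": (191_000, 105),
    "LOD B": (75_000, 105),
    "LOD C": (18_000, 3),
}


def share_of_base(lods, base="LOD A"):
    """Return each LOD's triangle and object count as a percentage of the base LOD."""
    base_tris, base_objs = lods[base]
    return {
        name: (100 * tris / base_tris, 100 * objs / base_objs)
        for name, (tris, objs) in lods.items()
    }


for name, (tri_pct, obj_pct) in share_of_base(LODS).items():
    print(f"{name}: {tri_pct:.1f}% of triangles, {obj_pct:.1f}% of objects")
```

The object percentage is worth watching alongside the triangle percentage, since LOD B keeps all 105 objects while LOD C drops to 3.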
Yes, it does seem that the object count is critical, particularly in the longer-distance renders. My testing today suggests there's also a small strain on the system from swapping the LODs in and out. That's the only explanation I can think of for why, in some instances, the game runs better with no LODs than with the LOD generator results.