Mass replay checking.

Started by namida, February 14, 2016, 09:16:16 AM


namida

So - I finally implemented some form of mass replay checker!

I've uploaded an experimental release that contains this feature. Note: The experimental release does NOT include the fix to the climber bug mentioned here; so the gameplay physics are identical to V1.42n. It does include the fix to single levels not playing music.

To use it, hit F7 on the title screen, then select any replay. Every replay in the same folder as the one you select will be tested. Note that, especially for large packs and/or long levels, this is likely to be very slow. It's pretty kludgey for now, but it works.

It should work with the majority of replays, even older ones (I tested it with some replays that come from as far back as V1.27n-C). However, if a replay was made with a version older than V1.35n, then it will only be testable via the mass checker if it was made with a custom-made player (not with NeoCustLemmix or testplay mode). This is because in other cases, the replay doesn't offer much in the way of determining which level it's for, outside of brute-force testing it with every level in the pack (which I thought would be going too far).


Once it finishes, it gives you a quick report of how many replays fell into each of the possible categories. For more detail, it also creates a text file in the same folder as the NXP that lists the replays grouped by category. The possible responses are:

"PASSED": This means that the replay solved the level.
"UNDETERMINED": This means that the replay could not be conclusively determined as either a pass or a fail. See note below.
"FAILED": This means that all lemmings died (or the time limit ran out, if there was one) without the save requirement being reached.
"ERROR": This means an error occurred when either loading the level or loading the replay file.
"CANNOT FIND LEVEL": This means that NeoLemmix couldn't identify which level the replay was meant to be for.

For the record, "Undetermined" is triggered if, 5 minutes (game time) after the last replay action, there are still lemmings alive, but the save requirement has not been reached. I have yet to encounter a case where this happens on a replay that eventually passes the level, but it's theoretically possible, which is why I made it a distinct outcome.
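
To make the PASSED / FAILED / UNDETERMINED rules above concrete, here is a rough Python sketch. It is not the actual NeoLemmix code; the names (`Outcome`, `classify`), the per-frame snapshot layout, and the figure of 17 frames per game second are all assumptions made for illustration. ERROR and CANNOT FIND LEVEL are left out because they arise while loading, before any playback happens.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "PASSED"
    UNDETERMINED = "UNDETERMINED"
    FAILED = "FAILED"


# Assumption: 17 frames per game second; adjust to the engine's real rate.
FRAMES_PER_SECOND = 17
# "5 minutes (game time)" expressed in frames.
UNDETERMINED_TIMEOUT = 5 * 60 * FRAMES_PER_SECOND


def classify(snapshots, save_requirement, last_action_frame,
             timeout=UNDETERMINED_TIMEOUT):
    """Classify one replay from per-frame snapshots.

    snapshots: iterable of (frame, saved, alive, time_up) tuples
               (a hypothetical layout, in increasing frame order).
    """
    for frame, saved, alive, time_up in snapshots:
        # Stop as soon as the save requirement is reached.
        if saved >= save_requirement:
            return Outcome.PASSED
        # Everyone is gone, or the clock ran out, without enough saved.
        if alive == 0 or time_up:
            return Outcome.FAILED
        # Lemmings still alive long after the last replay action.
        if frame - last_action_frame >= timeout:
            return Outcome.UNDETERMINED
    # Ran out of snapshots without a conclusive result.
    return Outcome.UNDETERMINED
```

Checking the save requirement first means a replay counts as a pass even if the last lemming exits on the very frame the level would otherwise end.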

It will also state, for each replay, what level it tried it on (in some cases - for example, if you have multiple levels with the same level ID - it may end up trying the same replay on more than one level), and whether it identified the level by its level ID or by its position.
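
As a rough picture of what a report grouped by category could look like, here is a small Python sketch. The function name `build_report`, the tuple layout, and the output format are hypothetical; the actual text file may be laid out differently.

```python
from collections import defaultdict

# The five categories listed above, in the order they are described.
CATEGORY_ORDER = ["PASSED", "UNDETERMINED", "FAILED", "ERROR", "CANNOT FIND LEVEL"]


def build_report(results):
    """Group replay results by category and render them as plain text.

    results: list of (replay_name, level_title, matched_by, outcome) tuples,
             where matched_by is "level ID" or "position" (hypothetical layout).
    """
    grouped = defaultdict(list)
    for replay_name, level_title, matched_by, outcome in results:
        grouped[outcome].append((replay_name, level_title, matched_by))

    lines = []
    for outcome in CATEGORY_ORDER:
        entries = grouped.get(outcome, [])
        # Summary count first, then one line per replay in that category.
        lines.append(f"{outcome}: {len(entries)}")
        for replay_name, level_title, matched_by in entries:
            lines.append(f"  {replay_name} -> {level_title} (matched by {matched_by})")
    return "\n".join(lines)
```

A replay that gets tried on more than one level (for example, when level IDs are duplicated) would simply appear as several tuples, one per level tried.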

Give it a try - let me know if you encounter any crashes / issues / etc. I know there's a lot of room for improvement in it; but for now, let's make the main focus be on making sure it works, not making sure it's optimized. (However, I have added a couple of optimizations since then - I disabled the fadeout (unnecessary for automated testing) during the mass test mode, and added code to terminate playback as soon as the save requirement is reached.)

Known issues:
- If you run the mass checker twice without exiting NeoLemmix in between, it crashes.
My projects
2D Lemmings: NeoLemmix (engine) | Lemmings Plus Series (level packs) | Doomsday Lemmings (level pack)
3D Lemmings: Loap (engine) | L3DEdit (level / graphics editor) | L3DUtils (replay / etc utility) | Lemmings Plus 3D (level pack)
Non-Lemmings: Commander Keen: Galaxy Reimagined (a Commander Keen fangame)

Simon

#1
Tested the nightly build (uploaded 3 hours ago) with a small dir of 4 replays. Apart from mediocre performance, I have encountered no bugs on the normal run.

I tried to repro the crash on checking twice within one program run, by checking the 4 replays again. It didn't crash. Instead, the second time, the program did 2 iterations over the files. The file order in the second iteration was the same as in the first. After the second iteration, the result msgbox popped up, proclaiming that 8 replays were successful. (There are 4 replays in the dir, and we made 2 passes.)

-- Simon

namida

Hm. Possibly a Windows vs Wine thing, or more likely, I introduced that bug somehow after I uploaded the experimental version. At any rate, I've fixed it now, although I haven't uploaded another experimental version yet - might do so later on today, after looking at a couple of the other recent bug reports.

Simon

Even slow automation is so much better than manual checking. This is a massive step forward. Pack designers can let the test run, take an invigorating shower, and will find results in a file later.

I am genuinely interested in NeoLemmix's ecosystem quality. It's never been a competition. Good tooling encourages good content, with potential to reel in new community members. Good features of one engine serve as examples of how to improve another engine. This has gone in both directions, with framestepping and replay checking.

I still take pride in the fast Lix replay checker. We verify 900 replays in 3 minutes. ;-)

<IchoTolot> [Simon,] vote for auto replay checker ^^ You are showing me your tool all the time and it's so good!
<SimonN> >_>


Apart from work-intensive internal restructuring, there is no reason why NL couldn't become similarly fast.

-- Simon

IchoTolot

I've got 150 levels checked in ~ 3-4 mins. This time is 100% acceptable.

Of course if there is a way to optimise the time, it would become even more convenient ;)

Nepster

I tried the replay checker as well and everything worked really well.

I don't really care about the speed. Even if it would take ten times as long, it would be OK. However with every new replay, the replay checker catches the mouse again, so one cannot do anything else while the replay checker works. If there is an easy way to remove this, it would be appreciated.

Simon

#6
Icho's speed is almost 1 per second. That's much faster than what I was experiencing on my old machine + Wine, where 1 replay took over 5 seconds.

At Icho's speed, I agree that energy is best invested elsewhere than into pure performance.

Also, rambling: I read my previous post again, and it reads like I'm all full of myself, I'm so awesome because I crank out 10 times the quality of everybody else, everybody come and look. :E

Seriously, the NL checker does a good job already. I'm amazed that it's reached this state within a few hours after conception.

-- Simon

namida

#7
Quote from: Nepster on February 14, 2016, 09:40:00 PM
I tried the replay checker as well and everything worked really well.

I don't really care about the speed. Even if it would take ten times as long, it would be OK. However with every new replay, the replay checker catches the mouse again, so one cannot do anything else while the replay checker works. If there is an easy way to remove this, it would be appreciated.

Yeah, this was bugging me as well, but I didn't get around to fixing it yet. I will do... eventually.

(EDIT: And by "eventually", I mean I've just done it.)