Lemmix

Started by EricLang, June 23, 2011, 08:15:02 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

EricLang

Thanks for the post. I'll have to read it a few times...

Hmm... there you got something. If the level changes an existing replay file will not work anymore.
However *not* storing the hashkey will make things really slow. Suppose I have a levelpack with 120 levels. I don't want to load them all in memory just to generate a hashkey of it, when searching the right level for a replay file.

Simon

That seems like best practice indeed -- you can store the hash of each level in the pack file, to speed up the hashtable generation. The consistency check will still happen per-level, but only when a level is actually used, not when a pack containing it is cached.

The underlying principle is that pieces in the hierarchy know about the hash values of their underlings, but not their own. I.e. you merely shouldn't put the level hash into the level itself and then rely on that alone.

Edit: The hash serves two purposes, which are quick retrieval of a level, and consistency checking. For retrieval, cache as much as you can. Consistency checking should happen only as much as necessary. The principle of the paragraph before is a way to satisfy both.

-- Simon

ccexplore

Linking a replay automatically to a level file is interesting and kind of convenient, but seems like a lot of work to solve a problem that isn't that big of a deal.  (If I'm interested in watching a replay I probably already have the level, and it's not rocket sciene to find and open the level in question.  If the replay was recorded for the wrong version of the level, it'll quickly become obvious when the replay starts going obviously wrong.)  But I guess if you're up to it, go for it then. http://www.lemmingsforums.com/Smileys/lemmings/wink.gif" alt=";)" title="Wink" class="smiley" />

And using SHA seems like over-engineering.  Isn't a simpler CRC enough?

EricLang

I did some tests with a 128 bit number XORING the complete memory stream of a level, which seems enough hash to me http://www.lemmingsforums.com/Smileys/lemmings/smiley.gif" alt=":)" title="Smiley" class="smiley" />
Combining this with the random generated levelnumber -used as a simpel lookup- it will probably work as intended.
The 128 bit number will be used as validation, and *is* stored in the header of the level.

Opening a replay file instantly playing the correct level is a big improvement in my eyes.

ccexplore

http://www.lemmingsforums.com/index.php?topic=523.msg11207#msg11207">Quote from: EricLang on 2011-08-01 17:26:38
I did some tests with a 128 bit number XORING the complete memory stream of a level, which seems enough hash to me http://www.lemmingsforums.com/Smileys/lemmings/smiley.gif" alt=":)" title="Smiley" class="smiley" />

I'm wary of using just XOR, due to its properties like x ^ 0 == x and x ^ x == 0, which seems like it can easily lead to collisions (ie. two levels hashing to the same value) even when done on 128-bit.  It might work better if you apply it to the compressed memory stream of the level, I'm not sure.

Perhaps there's some merit to using SHA after all even though it's slower and more complicated, at least it vastly reduces the likelihood of collisions (ie. two levels hashing to the same value).  CRC might be a good middle ground but I don't honestly know what the comparison is to SHA in terms of collision probability.

http://www.lemmingsforums.com/index.php?topic=523.msg11207#msg11207">Quote from: EricLang on 2011-08-01 17:26:38
Combining this with the random generated levelnumber -used as a simpel lookup- it will probably work as intended.

The use of random number negated what Simon mentioned, that it won't support a situation where you can ask the user to make a trivial modification themselves (eg. change initial RR to 99) and be guaranteed that the lookup will work.  If this is acceptable then I guess it's not an issue.

geoo

I don't see any point whatsoever in having a random number when you're already using a hash. As Simon said, the hash can be used for both finding the corresponding level to a replay (assuming the replay contains the hash value) in a hash table as well as an integrity check for the level.
The random number also has the aforementioned issue that if two players have the same level independently (for instance by changing a parameter to an existing level), that the level won't necessarily have the same random number.

SHA-1 (or any other commonly used hash function) should be easy to use because there are already implementations of them available.
XORing is prone to collisions, especially if you have a well structured level format, and thus doesn't serve too well for indexing nor integrity checks I think.

The versioning problem (i.e. having two versions of the level that are essentially the same, i.e. a solution recorded for the old version also works for the newer one) can be solved by allowing an option to force a level with a replay, and optionally overwrite the hash with the new one.
Under normal circumstances, changes to an existing level are significant though, and allowing to assign it the same random number (which doesn't work with hashes) isn't desirable: a user is prone to screw up if there's an option for it, as levels are saved very frequently in the design process; and automatically detecting whether a change is significant or not is impossible.

Mindless

If I were implementing it, I would use some sort of hash without any random numbers.  SHA-1 is popular enough that you ought to be able to find an implementation in your language of choice since you probably don't want to reimplement it.  We're not specifically looking for a strong hashing function here, so even CRC32 which wasn't designed as a hashing function would probably get the job done.  Or if you want to go nice and sturdy, use a http://en.wikipedia.org/wiki/Tiger_(hash)" class="bbc_link" target="_blank">Tiger hash.  http://www.lemmingsforums.com/Smileys/lemmings/laugh.gif" alt=":D" title="Laugh" class="smiley" />

I wouldn't worry about performance either.  Hashing isn't all that expensive.  Chances are good that whatever hardware you're using can hash faster than it can read data from a rotating magnetic disk.

EricLang

I found the adler crc. not tested yet.
If a level is created and saved, the random number will be there forever. Also when you edit it, or move it into another levelpack, or save it as a separate file.
I have converted the original dos lemmings now to the new Lemmix file format and will update you guys here with the "find replay: mechanism.

There are a lot of old lemmix replay files and a lot of custom levelpacks around.
I hope I can make a good conversiontool for these.

Simon

<geoo> SimonN: from his post, it sounds like he's still using that fricking random number
<SimonN> Yeah. If he keeps this, he MUST be the centralized source of levelpack conversions, or people will all end up with different random numbers.
<SimonN> Centralized = BAD BAD BAD
<geoo> yeah


Hmm, if you make a conversion tool, then there must still be a centralized source of levelpack conversions, or people will end up with all different random numbers. I do not think this is feasible for the large number of levels. In addition, you might want each user to convert L1/ONML levels himself due to their copyright restrictions.

I really urge you to get rid of the random number in this prospect. Either hash deterministically, or use the old manual way of associating replays to levels.

-- Simon

EricLang

You're right. Converting will give people different random numbers. That's not good.
And what are these stupid little characters for?
Besides automatic linking you can of course choose manually as well. For several reasons.

Simon

Any news on this project?

-- Simon

EricLang

The news is that real life intervened - again.