2.00 Compression of files?

Started by namida, October 05, 2015, 04:14:54 AM


namida

Final decision: This is now irrelevant, since I've given in to the demands and decided to use text / image file based formats.

So - this should be a straightforward one. Does anyone feel that there's a need to compress files for local usage? Obviously, file compression for anything that's to be distributed is a must, but do people think we're beyond the days where we really need to worry about it for local data, outside of the compression inherent in some specific, non-NeoLemmix-specific formats (such as OGG)? Or should it be retained? (In the event it is retained, most likely ZLib would be used.)
My projects
2D Lemmings: NeoLemmix (engine) | Lemmings Plus Series (level packs) | Doomsday Lemmings (level pack)
3D Lemmings: Loap (engine) | L3DEdit (level / graphics editor) | L3DUtils (replay / etc utility) | Lemmings Plus 3D (level pack)
Non-Lemmings: Commander Keen: Galaxy Reimagined (a Commander Keen fangame)

Simon

Go for OGG and PNG. I wouldn't compress anything else.

Nyargh, make things accessible first, only then small. :lix-evil: OGG and PNG are reasonably accessible formats, even though they hold compressed data.

Especially with stray levels in single files, you don't save much space by compressing each file per se. The space-saving approach would be tarring them all up into one compressed blob. :lix-suspicious: Optimize for the appropriate metric, which isn't necessarily filesize.
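
To illustrate the point about per-file versus combined compression, here is a minimal sketch in Python. The level contents are made up for the demonstration (they are not the real NeoLemmix level format), and plain concatenation stands in for a tar archive.

```python
import zlib

def per_file_size(files):
    """Total size when every file is zlib-compressed on its own."""
    return sum(len(zlib.compress(data)) for data in files)

def combined_size(files):
    """Size when all files are joined first and compressed as one blob.
    Concatenation is a stand-in for tar here (an assumption of this sketch)."""
    return len(zlib.compress(b"".join(files)))

# Hypothetical small text "levels"; the keys are invented for illustration.
levels = [
    f"title=Level {i}\nlemmings=50\nsave=25\nskill=climber:5\n".encode("ascii")
    for i in range(200)
]

print("raw bytes:      ", sum(len(data) for data in levels))
print("each compressed:", per_file_size(levels))
print("one blob:       ", combined_size(levels))
```

Small files share very little redundancy internally, so most of the saving only appears once the compressor can see repetition across files.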

-- Simon

namida

#2
PNG is unlikely to be used much, except as an import / export format. Although if a folder-based structure for graphic sets is supported, then PNG will almost certainly be a supported format (along with BMP, and maybe GIF (only because GIF support needs to be implemented anyway to support Lemmini graphic sets)). In actual NeoLemmix data files, a much simpler format is used - just a width, a height, and 32-bit pixel data. (In the case of a group of images, such as an animation or a font, it's preceded by a frame count, then a true/false byte to mark whether each image has a size specified individually, or if all of them use the first image's size. And I'm not even sure that the whole "shared size" byte is necessary, rather than just specifying the sizes each time no matter what - it's only 8 extra bytes per frame, after all.)
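
A rough sketch of how the image-group layout described above could be written and read in Python. Several details are assumptions not stated in the post: little-endian byte order, 32-bit unsigned width and height (consistent with "8 extra bytes per frame"), a 32-bit frame count, and no particular channel order. The function names are hypothetical.

```python
import struct

def write_group(frames, individual_sizes):
    """frames: list of (width, height, pixels), pixels being 32-bit ints.
    Layout: frame count, one flag byte, then the frames. When the flag is
    false, only the first frame carries a size and the rest reuse it."""
    out = struct.pack("<IB", len(frames), 1 if individual_sizes else 0)
    for index, (width, height, pixels) in enumerate(frames):
        if len(pixels) != width * height:
            raise ValueError("pixel count does not match width * height")
        if individual_sizes or index == 0:
            out += struct.pack("<II", width, height)
        out += struct.pack(f"<{len(pixels)}I", *pixels)
    return out

def read_group(buf):
    """Inverse of write_group; returns a list of (width, height, pixels)."""
    count, individual = struct.unpack_from("<IB", buf, 0)
    offset = 5  # 4-byte frame count plus 1-byte flag
    frames, size = [], None
    for index in range(count):
        if individual or index == 0:
            size = struct.unpack_from("<II", buf, offset)
            offset += 8
        width, height = size
        n = width * height
        pixels = list(struct.unpack_from(f"<{n}I", buf, offset))
        offset += 4 * n
        frames.append((width, height, pixels))
    return frames

frames = [(2, 2, [0xFF000000 + i] * 4) for i in range(3)]
shared = write_group(frames, individual_sizes=False)
separate = write_group(frames, individual_sizes=True)
print(len(separate) - len(shared))  # 8 bytes saved per frame after the first
```

With three frames the shared-size variant saves 16 bytes, which matches the "8 extra bytes per frame" estimate for every frame after the first.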

Some kind of binary-based format is preferable, because otherwise a piece would need to consist of at least two files at the very minimum (those being a metainfo file of some kind, and an image in at least one resolution). More than that in more-complex cases, such as a piece that has some parts which are steel and others which are not (not likely to happen with average pieces, but perfectly reasonable with VGASPEC-style ones). Then take into account trigger areas / points for objects (for the former, allowing arbitrary shapes / non-contiguous areas / etc is being strongly considered - questionable whether it'll be used, but it probably won't hurt to have, even if the most common use will probably be to have circular trigger areas instead of rectangular ones), as well as multiple frames - possibly in multiple resolutions. And then take into account that a single graphic set tends to have more than one piece in it. Now you see why I prefer a binary format?

Indeed, I wasn't thinking about compressing levels (unless they're in a for-distribution pack file). This was more relating to system graphics and graphic sets than anything else. (I had initially considered prepending every file with a byte identifying the type of compression (possibly none at all) and always handling such on load, but I quickly realised this was fairly pointless.)
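
For reference, the "compression type byte" idea mentioned above could look roughly like this in Python. The method codes and function names are invented for this sketch, and zlib is used only because it is the library named earlier in the thread.

```python
import zlib

# Hypothetical method codes; not taken from any NeoLemmix format.
COMPRESSION_NONE = 0
COMPRESSION_ZLIB = 1

def pack(data, method):
    """Prefix the payload with one byte saying how it was compressed."""
    if method == COMPRESSION_NONE:
        return bytes([method]) + data
    if method == COMPRESSION_ZLIB:
        return bytes([method]) + zlib.compress(data)
    raise ValueError(f"unknown compression method {method}")

def unpack(blob):
    """Read the leading method byte and undo the matching compression."""
    method, payload = blob[0], blob[1:]
    if method == COMPRESSION_NONE:
        return payload
    if method == COMPRESSION_ZLIB:
        return zlib.decompress(payload)
    raise ValueError(f"unknown compression method {method}")

pixels = bytes(4 * 32 * 32)  # a blank 32x32 image with 32-bit pixels
for method in (COMPRESSION_NONE, COMPRESSION_ZLIB):
    stored = pack(pixels, method)
    print(method, len(stored), unpack(stored) == pixels)
```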

Now in terms of sound, yes, using OGG makes the most sense. (I must make a point of converting all the standard sound effects to OGG, instead of keeping them as WAV files.) Custom handling of sound in this way is not nearly as straightforward as doing so for graphics (either that, or I just don't know shit about how sound files work), and at any rate, since an external library capable of natively using OGG files is used to handle audio, there's little reason not to use it. Heck, other formats being supported (apart from IT / MOD / etc, which I'm sure I don't need to explain the reasons for supporting) - such as WAV or AIFF - is more because there's little if any extra work involved in doing so; not because I actually think using them is a particularly great idea.

geoo

Quote: Some kind of binary-based format is preferable, because otherwise a piece would need to consist of at least two files at the very minimum (those being a metainfo file of some kind, and an image in at least one resolution). [...] Now you see why I prefer a binary format?
No, not really.
A badly organized folder for each graphic set is still better than a binary blob. And well, you can guess about a well organized folder...
For one, you can use subfolders which you can use e.g. for zoom levels, different kinds of objects, or even if you want to sort the terrain pieces thematically.
The single downside I see is that users will have to unpack a new graphics set instead of just dumping a blob in the styles folder. But graphics set files could just be compressed folders, then you even work around that issue of the user botching up unpacking an archive. (Only downside there is you can't really version graphics styles, and updates to graphics sets are more annoying, unless you always distribute the whole blobs which you propose anyway. Which is kinda funny considering you're trying to shave off bytes here and there in the file formats, and then encourage users to shove huge blobs across the internet.)

Now the upside is that you don't have the redundancy between a 'source' version of a level pack and a compiled blob version. Which also means you don't have to write tools to create such blobs. But to me, the main point here is that other users can contribute to other graphic sets and/or level sets a lot more easily, make actual suggestions (e.g. new/altered pngs) instead of being limited to nagging the graphics set author. I think the issue was very prevalent with Revenge of the Lemmings where users were unable to fix their own levels just because they were all in one big blob. Instead people had to describe the desired changes, and mobius would implement them. Waste of everyone's time. And it actually discourages people from working together.

I can see that sometimes some people might want to distribute their pack to non-Lemmings Forums people as a standalone executable, but this shouldn't be the default behaviour for those people who play more than one level pack. Just have all the data be organized in folders by default, and give the option to pack everything as a standalone executable if you think people are interested in this.

ccexplore

@namida: so we are basically talking about whether level files should be compressed or not? I'm not that familiar with the file usage of NL so it's not completely clear which specific types of files we are thinking of not compressing. Also interesting to know what you feel would be the advantages of not compressing.

Without the full context of things, this feels like an internal implementation details sort of thing that I'm not sure would lead to any end-user-impacting effects, other than usage of storage.

namida

Quote: @namida: so we are basically talking about whether level files should be compressed or not? I'm not that familiar with the file usage of NL so it's not completely clear which specific types of files we are thinking of not compressing. Also interesting to know what you feel would be the advantages of not compressing.

Basically - any file intended for distribution (which won't be a standalone EXE, but perhaps more comparable to a ZIP file - except with metadata that's useful to NeoLemmix, rather than useful for decompressing a folder structure) would be compressed. But NeoLemmix would "install" these files, not use them directly. In terms of locally-stored files, I'm mostly only considering whether or not to keep graphic files compressed - I don't see levels getting large enough that it'd matter; in the current format, the largest level file I'm aware of is still only just over 11KB ("Goodbye Galaxy" from LPIV). I'm not sure exactly what sizes we're looking at for graphic sets in NX2 (given that multiple resolutions will be storable); I'm not far from the point where a file format for them needs to be implemented, so once we're at that point I can give some more useful numbers.

But yes, the only real impact on the end user would be storage vs execution speed (though I doubt it'll have too large an impact on either).

Simon

#6
Think very long and hard whether you want to keep graphics sets, or abolish them in favor of a dir with images.

I want to write an article/thread about this, but I'm too tired now. (Edit 2015-10-07: Thread on why plain graphics/text files are superior to any custom file format.) I want to make a full argument, it's important. The bottom line is, hours and hours have already been wasted because NL doesn't allow simple dirs with images, and I've been afflicted myself directly at least once.

-- Simon

namida

#7
What I'd be more interested in is - how is it either going to be simpler for the user, or allow doing stuff that isn't possible otherwise (taking into account that a single level will be able to use multiple graphic sets if desired)? I had considered splitting them into separate files, possibly organised via folders (to create something similar to a graphic "set"), for each piece (but the files themselves would still be a NX2-specific format, so that all the various images as well as metadata for a piece are contained in a single file), but I'm not seeing any huge advantage in this. On the other side, I could see it causing problems if - say - a user makes a pack that only uses a few pieces out of what's meant to be a "graphic set", and thus when they distribute their pack, other users (who may in turn want to use this set) end up with only parts of the set, rather than the full collection (which is not possible if graphic sets are maintained).

Sure, a directory structure might be a tiny bit simpler to implement on the programming side of things. But I can see it causing a lot of hassle for end-users that a proper graphic set format would not. One possible compromise is to support both, and that is already something I am considering - that way a graphic set can exist as a collection of files during development, then be packaged into a tidy single-file format for distribution once it's completed. Since the decision has already been made not to restrict use of content, if someone really wished to have it in directory structure format - either due to wanting to make changes to it (though for obvious reasons I'd advise using a new name in such a case), or simply because they would rather other formats were not used - it'd be completely possible to convert it in the editor. A similar thing is already possible with the existing NX1 graphic set tool, with the catch that NX1 can't directly use the directory structure (rather, the tool has to be used to convert it back to the NX1 native format first), and I have indeed found it useful during development of graphic sets - but very little reason to continue using it for a finished set. Since NX2 is being designed in such a way that support for extra formats can very easily be added, it would be little (if any) extra work to support directly using these formats, compared to a setup where they can be imported/exported but not used directly.

EDIT: At any rate, this kind of discussion should be going in the Graphic Sets topic, not this one... compression is the issue for this topic; which would be equally relevant if stored as single pieces (but quite possibly less efficient in such a case - not sure exactly, I'm not overly familiar with the inner workings of ZLib). Yes, it would be irrelevant if PNG is used as the format - which it most likely would be used as the primary one (with BMP and maybe GIF supported as alternatives) for a directory structure based format, but even if graphic sets are split into single files per piece for the native format (which is a decision I'm unlikely to go with), PNG would not be used there; it'll use the same image format used everywhere in NX2 for the image sections of the data, which is simply a width, a height, and 32-bit pixel data for each pixel; which might then get ZLib compressed. No extra headers, support for various bit depths, padding, or anything like that - this is about as simple as it can be kept.

geoo

Quote: What I'd be more interested in is - how is it either going to be simpler for the user, or allow doing stuff that isn't possible otherwise (taking into account that a single level will be able to use multiple graphic sets if desired)? I had considered splitting them into separate files, possibly organised via folders (to create something similar to a graphic "set"), for each piece (but the files themselves would still be a NX2-specific format, so that all the various images as well as metadata for a piece are contained in a single file), but I'm not seeing any huge advantage in this. On the other side, I could see it causing problems if - say - a user makes a pack that only uses a few pieces out of what's meant to be a "graphic set", and thus when they distribute their pack, other users (who may in turn want to use this set) end up with only parts of the set, rather than the full collection (which is not possible if graphic sets are maintained).

Now this...doesn't make much sense to me.

Say you have a graphic set designer working on a new graphic set. Certainly it is easier for him to have an image file, maybe another image file with a mask if you support arbitrary trigger areas, and maybe a text file with meta information (for objects...terrain pieces are self-contained, encoding e.g. their size would be redundant). Work-in-progress means lots of changes and lots of testing. It is significantly easier to just change the image file and run the game to test, than to change the image file, send it through some converter tool where you have to enter the meta information, and then run the game to test. Every single time. The latter just so we have 1 file instead of 1, 2 or 3. If a designer doesn't put much effort into their graphics set, maybe they'll have to go through the conversion process only once for each tile. But if you're actually ambitious and keep tweaking things instead of doing a rush job, then this is certainly gonna be a huge pain.
For the user who just wants to play levels it's not gonna make any difference. If you're really so afraid of them botching things up, put the folder structure into a compressed archive and change the extension for good measure.
The designer just distributing half of his graphics set because half of his graphics are so useless that they're used in none of his levels, and he actually takes all the effort to go through every level to check which tiles are unused...umm sure. Sounds a bit far-fetched to me. And you know, your designer could also just compile a reduced-tiles tile set and name it the same if he's going through so much effort to save a few bytes. And different work-in-progress versions of a graphics set floating around...you have that with binary blobs too.
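
As a sketch of the folder layout described above (one image per piece, an optional mask image, an optional text file with meta information), a loader could discover pieces roughly like this in Python. The naming scheme (`name.png`, `name_mask.png`, `name.txt`) and the `key = value` metadata syntax are assumptions made for illustration, not anything NeoLemmix or Lix defines.

```python
import tempfile
from pathlib import Path

def read_meta(path):
    """Parse a hypothetical 'key = value' text file into a dict."""
    meta = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        meta[key.strip()] = value.strip()
    return meta

def scan_pieces(folder):
    """Group the files in a folder into pieces, keyed by the image's stem."""
    pieces = {}
    for image in sorted(Path(folder).glob("*.png")):
        if image.stem.endswith("_mask"):
            continue  # masks are attached to their piece below
        mask = image.with_name(image.stem + "_mask.png")
        meta = image.with_suffix(".txt")
        pieces[image.stem] = {
            "image": image,
            "mask": mask if mask.exists() else None,
            "meta": read_meta(meta) if meta.exists() else {},
        }
    return pieces

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Empty placeholder files; the sketch never decodes the images.
    (root / "brick.png").write_bytes(b"")
    (root / "exit.png").write_bytes(b"")
    (root / "exit_mask.png").write_bytes(b"")
    (root / "exit.txt").write_text("type = exit\nframes = 8\n")
    found = scan_pieces(root)
    print(sorted(found))
```

A terrain piece such as `brick` needs only its image, while an object such as `exit` picks up its mask and metadata from the neighbouring files, so editing a piece means editing a file and rerunning the game.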

That said, most of these things probably won't affect me all that much personally, as I only aim to play some level packs in NeoLemmix and not create my own content (that's what I got Lix for). Just the idea of people wasting hours on menial tasks that they shouldn't be wasting time on makes me cringe, as has happened in the past.
My personal interest is seeing NL2 released sooner rather than later, because the only feature I really care about is customizable hotkeys. I thought about asking you if you could compile a NL1 version just with my favourite hotkeys hardcoded, but then I realized that this doesn't solve anything as most level packs are all compiled into the huge binary blob that the main executable is.
Whatever. Don't rush things. But avoid feature bloat.