Kelv's Random Collection

A random collection of my contributions to the world.

Little Endian Bitfields

Posted by kelvSYC on 6-13-2014

In Synalyze It!, you can create integer fields of various sizes.  Fields that span a whole number of bytes can be little-endian or big-endian.  This is all fine and good, but Synalyze It! has to process structure members in the order that they are declared, and the size of a structure can be entirely determined by its contents.  As a result, true bitfields (where multiple integers are crammed into a whole number of bytes) can only be rendered with Synalyze It! primitives if the bitfield is encoded as a big-endian integer.

This presents a problem, since a lot of platforms are little-endian or mixed-endian.  You can get away with it if your fields happen to align neatly with byte boundaries, but this doesn’t always happen.

To get a better idea of what I am speaking of, consider a 16-bit integer acting as a bitfield.  This 16-bit integer consists of a 7-bit integer and a 9-bit integer.  If you try to model it using Synalyze It! primitives, the bitfield will always be rendered as follows:

xxxxxxx- -------- 7-bit integer
-------x xxxxxxxx 9-bit integer

This is regardless of the endianness of the integer that these two fields have been packed into.  Why is this, given that all structures have an endianness property?  That property only sets the endianness of each individual field; it is not an instruction to render the structure with the bytes in reverse order.  (Again, Synalyze It! generally does not know the size of a structure before rendering it, and even if you had a fixed-size structure, Synalyze It! will not reverse the bytes contained within before rendering.)  Thus, bitfield parsing only works if the 16-bit integer is big-endian.  If these two fields are packed into a little-endian integer, then this is what we actually have in the file:

-------- xxxxxxx- 7-bit integer
xxxxxxxx -------x 9-bit integer

In other words, the 9-bit integer is now in two pieces, and cannot be modelled by a single field.
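To make the difference concrete, here is a minimal sketch in plain Python (it does not use any Synalyze It! scripting API).  It consumes bits in file order, starting from the most significant bit of the first byte, which is how a sequence of bit-sized primitives behaves as described above, and applies that to the same two values packed big-endian and little-endian.  The function names and the sample values 85 and 419 are made up for illustration.

```python
def read_in_file_order(raw, widths):
    """Consume bits MSB-first in file order, one field after another."""
    stream = int.from_bytes(raw, "big")  # file order is big-endian order
    remaining = 8 * len(raw)
    values = []
    for width in widths:
        remaining -= width
        values.append((stream >> remaining) & ((1 << width) - 1))
    return values


def pack_7_9(seven, nine, byteorder):
    """Pack a 7-bit and a 9-bit value into 16 bits, 7-bit field on top."""
    word = ((seven & 0x7F) << 9) | (nine & 0x1FF)
    return word.to_bytes(2, byteorder)


big = pack_7_9(85, 419, "big")        # bytes AB A3
little = pack_7_9(85, 419, "little")  # bytes A3 AB

print(read_in_file_order(big, [7, 9]))     # [85, 419]
print(read_in_file_order(little, [7, 9]))  # [81, 427], both fields are wrong
```

With the little-endian bytes, the first seven bits in file order belong to the low byte of the 9-bit integer, so neither field comes out right.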

Now, there are several ways you can work around this:

  • Code the bitfield as a single integer, and extract the two fields using scripts.  The pro is that you can do this even without the Pro version: bitmasking will, to some extent, allow you to extract individual fields.  This is generally good enough if every field is one byte, but it also makes it impossible to fix some fields while leaving others unfixed, since Synalyze It! still treats your bitfield as a whole and never as individual fields.
  • Model any field that spans multiple bytes as separate integers, and extract the true value using scripts.  To create a script element that extracts the value, you need to traverse the results tree, which is generally feasible via Results::getResultsByName().  You can then extract the values, manipulate them with Lua’s or Python’s regular integer tools, and insert a value into the results tree to represent the actual value.  The downside is that you either have to cut the results tree to remove the original values (which also removes the ability to alter a value in the results tree and have the changes propagate to the actual file), or live with extraneous data polluting your results tree.
  • Write a custom element.  Custom elements provide the maximum flexibility, in that you have greater freedom to insert exactly what you want into the results tree.  Two things are of note: you are inserting a structure into the tree rather than a single value, and your inserted structure is read-only, since there is no “structure value” type and implementing the fillByteRange() function is therefore impossible.

The latter two approaches also mean that you have to custom-make this for every little-endian bitfield you encounter.  Their implementations rely on two techniques that I have found to be useful: zero-length script elements in the second approach, and “manual mapping via prototype” in the third approach.
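For the first two workarounds, the integer arithmetic a script has to perform is small.  The sketch below is plain Python and covers only that arithmetic for the 7-bit/9-bit example; reading the inputs out of the results tree is left out, and the function names are hypothetical.

```python
def extract_by_mask(word):
    """First workaround: the bitfield was parsed as one little-endian uint16."""
    seven = (word >> 9) & 0x7F  # top 7 bits
    nine = word & 0x1FF         # bottom 9 bits
    return seven, nine


def join_pieces(low_byte, high_bit):
    """Second workaround: the 9-bit field was parsed as two separate pieces.

    low_byte is the 8-bit integer taken from the first file byte,
    high_bit is the 1-bit integer taken from the end of the second file byte.
    """
    return ((high_bit & 0x1) << 8) | (low_byte & 0xFF)


# File bytes A3 AB: as a little-endian uint16 this is 0xABA3.
print(extract_by_mask(0xABA3))  # (85, 419)
print(join_pieces(0xA3, 1))     # 419
```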

Zero-length script elements are fairly straightforward to implement:

currentElement = currentMapper.getCurrentElement()
currentMapper.addElement(currentElement, 0, 0, value)
return 0

The custom element approach is something entirely different.  First, you must create a top-level structure that will act as your prototype.  This structure won’t actually appear anywhere else in your grammar, but it lets you set up field sizes, element names, and such.  Once the prototype structure exists, you can refer to it from within parseByteRange() via

currentGrammar = element.getEnclosingStructure().getGrammar()
prototype = currentGrammar.getStructureByName("Prototype")

From there, you can then use the given ByteView (byteView) to extract the necessary data, prototype.getElementByName() to retrieve the Elements corresponding to your fields, and simply add to the results tree: results.addStructureStart() takes in your prototype structure and results.addElementBits() takes your field Elements and Values.
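The value-extraction step in the middle of that process can be sketched independently of Synalyze It!.  The helper below is plain Python and makes no Synalyze It! calls: it assumes the raw bytes have already been read from the ByteView, and it returns values keyed by element name so that each one could then be paired with the Element looked up from the prototype.  The helper name and the field names "Seven" and "Nine" are hypothetical.

```python
def split_fields(raw, fields):
    """Split a little-endian packed integer into named field values.

    fields is a list of (name, width_in_bits) pairs, most significant first.
    """
    total_bits = 8 * len(raw)
    if sum(width for _, width in fields) != total_bits:
        raise ValueError("field widths must cover the packed integer exactly")
    word = int.from_bytes(raw, "little")  # undo the byte order first
    shift = total_bits
    values = {}
    for name, width in fields:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


# Names would have to match the elements of the prototype structure.
layout = [("Seven", 7), ("Nine", 9)]
print(split_fields(bytes([0xA3, 0xAB]), layout))  # {'Seven': 85, 'Nine': 419}
```

Keeping the layout as data means the same helper can serve several bitfields, although a separate prototype structure is still needed for each one.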

That’s still quite a lot to do in order to properly render a structure that’s packed into a little-endian integer.  Worse still, the custom element approach tends to misrepresent your fields’ actual location in the hex view (the 7-bit integer, despite being taken solely from the second byte, will appear to come from the first byte).
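To see where each field really lives in the file, the following plain-Python sketch maps fields of a little-endian packed integer to the file byte offsets they occupy.  It only illustrates the layout and says nothing about how Synalyze It! decides what to highlight; the function name is made up.

```python
def file_bytes_for_fields(widths, total_bytes):
    """Return, per field, the file byte offsets it occupies.

    widths lists field sizes in bits, most significant field first,
    for an integer stored in little-endian byte order.
    """
    spans = []
    high = 8 * total_bytes  # one past the field's top bit (bit 0 is the LSB)
    for width in widths:
        low = high - width
        # In little-endian order, integer bit n is stored in file byte n // 8.
        spans.append(list(range(low // 8, (high - 1) // 8 + 1)))
        high = low
    return spans


print(file_bytes_for_fields([7, 9], 2))  # [[1], [0, 1]]
```

The 7-bit integer sits entirely in the second file byte (offset 1), while the 9-bit integer straddles both bytes.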

In short, none of the solutions will let you render all and only the fields that you want without sacrificing generalizability, mutability, accurate representation in the hex view, or script-free-ness.  See which approach works for you.
