Author Archive

Git Bisect and Why Commit Early and Often

February 12, 2012 1 comment

Recently, I discovered that at some point we had introduced a bug where only our first conversion would succeed. A stack trace pointed to memory corruption in our converter list. The converter list contains all of the converter objects, which are responsible for everything related to a loaded converter. As a result, they are accessed all over the program, and once I had verified that all of them were created properly, I was concerned that this was going to be a really nasty issue to resolve. However, as it was a fairly reliable crash that was easy to reproduce, I decided to give git bisect a try.

For those who aren’t familiar with git bisect, it is designed for exactly this situation. You provide it with a known good commit and a known bad commit, and it performs a binary search on the commit history, giving you commits to test. After a few recompiles and some quick tests, git bisect helped me determine which commit was responsible for introducing the bug. This commit was “Now converts with a copy of a converter.” It was a one-line change, replacing a pointer copy with the creation of a new object using a copy constructor. Class converter has no hand-written copy constructor, only the default compiler-generated one, which copies each member as-is. Now the source of the memory corruption was obvious: we were duplicating pointers and trying to access them after they had been deleted.
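The failure mode is easy to reproduce in miniature. The Converter below is a hypothetical stand-in, not our real class: it owns a buffer through a raw pointer, and the comments mark what the compiler-generated copy constructor would have done. The sketch shows one possible fix, a deep-copying copy constructor plus a matching assignment operator; treat it as an illustration of the problem rather than the exact change made in the project.

```cpp
#include <cstring>
#include <utility>

// Hypothetical stand-in for a class that owns memory through a raw pointer.
class Converter {
public:
    explicit Converter(const char* name)
        : name_(new char[std::strlen(name) + 1]) {
        std::strcpy(name_, name);
    }

    // Without this constructor the compiler generates a member-wise copy:
    // both objects hold the same name_ pointer, and once one of them is
    // destroyed the other is left pointing at freed memory.
    Converter(const Converter& other)
        : name_(new char[std::strlen(other.name_) + 1]) {
        std::strcpy(name_, other.name_);
    }

    // Copy-and-swap assignment keeps the same ownership guarantee.
    Converter& operator=(Converter other) {
        std::swap(name_, other.name_);
        return *this;
    }

    ~Converter() { delete[] name_; }

    const char* name() const { return name_; }

private:
    char* name_;  // owned; must never be shared between two Converters
};

// Returns true when a copy has its own storage rather than a shared pointer.
bool copyIsIndependent() {
    Converter original("flac");
    Converter copy(original);
    return copy.name() != original.name() &&
           std::strcmp(copy.name(), original.name()) == 0;
}
```

Deleting the copy constructor and assignment operator from this sketch brings back the original behavior: the copy compiles fine, and the corruption only shows up later, far from the line that caused it.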

This is exactly the sort of situation where a version control system shines. Because I could look back in time to see what had been done in the past, and because the commit was a small one making a single change, I was able to determine the cause of the problem much more quickly than I would have been able to had I needed to sift through all the code that interacts with converter objects. Lesson learned: commit early, commit often, and use your development tools where they shine, and they will make your job as a developer a lot easier.
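The search that git bisect automates is an ordinary binary search over an ordered history. The sketch below assumes a linear history indexed from oldest to newest, and a hypothetical isBad callback standing in for the compile-and-test step. Real git bisect also copes with branching history, which this ignores.

```cpp
#include <cstddef>
#include <functional>

// Finds the index of the first bad commit in a linear history.
// Assumed preconditions: commit `good` passes, commit `bad` fails,
// good < bad, and every commit after the first bad one is also bad.
std::size_t firstBadCommit(std::size_t good, std::size_t bad,
                           const std::function<bool(std::size_t)>& isBad) {
    while (bad - good > 1) {
        std::size_t mid = good + (bad - good) / 2;
        if (isBad(mid)) {
            bad = mid;   // the bug was introduced at mid or earlier
        } else {
            good = mid;  // the bug was introduced after mid
        }
    }
    return bad;
}
```

Each test halves the remaining range, so with 100 commits between the good and bad endpoints this needs at most seven builds. Small commits are what make the final answer useful: the search ends at a single commit, and a one-line commit leaves very little to read.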

Categories: Uncategorized

Universal Batch Converter 0.1

January 22, 2011 Leave a comment

I am proud to announce the release of the Universal Batch Converter 0.1. This being our first release, it’s fairly basic. Out of the box, we support conversions between FLAC, WAV, and OGG audio files. Additionally, we support MP3 via LAME and various image format conversions via ImageMagick’s convert program; however, these are not bundled in this release. More conversion types can be added by writing XML files, and that is the biggest reason for this release: it represents a conforming implementation of version 1.0 of the XML converter specification format as documented on our wiki. There are plenty of bugs and missing functionality, but the program should be able to handle conforming XML files, and as a result can serve as a testbed for adding support.

We have a Windows release available now, with a source bundle and static builds for Linux to follow. It is also available via git.

Categories: Uncategorized

Delays and Backend Architecture

December 19, 2010 100 comments

We haven’t made a lot of progress because the whole development team has had finals, and we have several showstopper bugs, to the point where our program really isn’t useful. Most of the problems stem from integration issues between code written by different people. While I’m working through some of these issues I thought I’d document our backend architecture here to make sure I understand it completely, to give some insight into our program, and maybe get some feedback on our design. I’ve tried to keep it as high-level as possible, but it does get a bit technical.

First, some background.  We have a class, BackendManager, which is the overall manager of conversions.  Inside BackendManager, we have two main abstractions, Converter and Conversion.

Converter is a standard interface to your various converters. It abstracts away all the specifics of the individual programs. It provides a list of supported input and output formats, and all the configuration consists of setting the input and output formats to use, checking that the resulting configuration is valid, and executing with input and output files.

Conversion is a container class, with an input filename, an output filename, a pointer to the Converter to use, and a pointer to the next Conversion for when multiple converting steps are required.

At program startup, a BackendManager object is created. It is then given a directory of XML files, from which it builds the global set of Converters. The actual parsing of each file is handled by the Converter, so BackendManager just has to iterate through the files and create a Converter object for each one, feeding it the filename of the XML file. It then builds a hash table of valid single-step conversions from all the loaded Converters.
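One way to picture the table of single-step conversions is a map from an (input format, output format) pair to the Converter that handles it. This is a sketch under assumptions, not the project's code: the simplified Converter struct, its inputs and outputs fields, the key format, and the function names are all hypothetical, and the XML loading step is left out.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified Converter: just a name and its format lists.
struct Converter {
    std::string name;
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// Key is "input>output"; value points at a Converter owned by the caller.
using ConversionTable = std::unordered_map<std::string, const Converter*>;

std::string makeKey(const std::string& in, const std::string& out) {
    return in + ">" + out;
}

// Registers every input/output pair each Converter supports.
// If two Converters cover the same pair, the first one loaded wins.
ConversionTable buildConversionTable(const std::vector<Converter>& converters) {
    ConversionTable table;
    for (const Converter& c : converters) {
        for (const std::string& in : c.inputs) {
            for (const std::string& out : c.outputs) {
                if (in != out) {
                    table.emplace(makeKey(in, out), &c);
                }
            }
        }
    }
    return table;
}

// Single-step lookup; returns nullptr when no Converter handles the pair.
const Converter* findConverter(const ConversionTable& table,
                               const std::string& in, const std::string& out) {
    auto it = table.find(makeKey(in, out));
    return it == table.end() ? nullptr : it->second;
}
```

The table stores pointers into the vector of Converters, so that vector has to outlive the table and must not be resized after the table is built.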

The program can query BackendManager for the supported input and output formats, and can additionally check whether an output format is supported for a given input. Currently it only checks single-stage conversions, but fixing that is on the todo list.

The program can also request that a conversion be set up for a given set of inputs and outputs. It is returned a Conversion pointer.

It can then give these Conversion pointers to BackendManager when an actual conversion is to be performed. We plan to refactor later so that external code doesn’t need to know about Conversions at all. Each Conversion pointer is assumed to be independent of the others, with dependent Conversions reached indirectly through the next-Conversion pointer inside each Conversion. BackendManager then kicks off a conversion process in a separate thread with ConversionExecutor. ConversionExecutor creates a new thread for each Conversion using ConversionStarter, up to the maximum number of concurrent threads, currently hardcoded to 5. Once 5 threads are running, it waits for one to finish and then starts another, continuing until it runs out of Conversions to run.

That’s our program’s backend, more or less. The current bugs considered showstoppers are that our program has:

  • called the wrong converter
  • not configured a converter
  • lost conversion steps
  • crashed instead of converting

As soon as these are straightened out such that our program actually converts as it is asked, at least most of the time, we will get a release out.

Categories: Uncategorized

Still Alive

November 27, 2010 6 comments

Despite the lack of visible work, progress is being made. I just merged in my local changes, which bring the Converter class to about where it needs to be for our first release. Kotarf and fudgemutator have been planning and starting in on the implementation of converter management. Jonnyk127 continues to work on the GUI. Blahbots is building a graphical converter description file creator. Casperdogg is learning cmake to get a working build system up and running. We’re working towards getting a first release together, so that others can start playing with our code.

Categories: Uncategorized

Progress and CLI options

November 4, 2010 7 comments

We continue to make progress. We now have a target feature set for an initial working build, and are closing in on completion. I just performed the first successful conversion using my Converter framework, the descendant of the Plugin architecture I talked about in our first presentation. We are targeting feature completion tonight, so that we can work out integration issues and get a program that actually runs.

For the most part, progress is going well.  Within my code, the biggest obstacle to having a full set of basic functionality is what I’ve been referring to as argument options.  These are command-line flags that take arguments.  The problem is there is no clear standard for these options, from -l{filename} in GNU ld to -V{0-9} in LAME to -loop {number} in mplayer and -o {filename} in FLAC.  The Converter framework needs to:

  • Pass the arguments to the program correctly – programs with a space between the option and its value expect the value as the next argument passed, while those without expect it in the same argument.
  • Handle the option content – some options are one of a set of possibilities, some are a number, some are an arbitrary string.
  • Provide configuration hints to the GUI – We are generating the settings pages dynamically, so I need to tell the GUI what is valid, and give info on a sensible way to expose it.
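A rough sketch of how the first two requirements might be modeled follows. None of these names come from the Converter framework: ArgOption, ValueStyle, ValueKind, appendOption, and isValidValue are hypothetical, and the validation rules are deliberately simple. The same kind and choices fields are the sort of information a GUI could use as configuration hints.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// How an option's value is attached to its flag.
enum class ValueStyle {
    Attached,  // flag and value share one argument, e.g. "-V2"
    Separate   // value is the following argument, e.g. "-o" "out.flac"
};

// What kind of content the option accepts.
enum class ValueKind { Choice, Number, Text };

struct ArgOption {
    std::string flag;
    ValueStyle style;
    ValueKind kind;
    std::vector<std::string> choices;  // only used when kind == Choice
};

// Checks a value against the option's declared kind.
bool isValidValue(const ArgOption& opt, const std::string& value) {
    switch (opt.kind) {
    case ValueKind::Choice:
        return std::find(opt.choices.begin(), opt.choices.end(), value) !=
               opt.choices.end();
    case ValueKind::Number:
        return !value.empty() &&
               std::all_of(value.begin(), value.end(), [](unsigned char ch) {
                   return std::isdigit(ch) != 0;
               });
    case ValueKind::Text:
        return true;
    }
    return false;
}

// Appends the option and its value in the form the program expects.
void appendOption(std::vector<std::string>& args, const ArgOption& opt,
                  const std::string& value) {
    if (opt.style == ValueStyle::Attached) {
        args.push_back(opt.flag + value);
    } else {
        args.push_back(opt.flag);
        args.push_back(value);
    }
}
```

With this model, an attached numeric option with flag "-V" and value "2" yields the single argument "-V2", while a separate text option with flag "-o" yields two arguments, the flag and then the filename.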

Once this issue has been dealt with, the framework will be largely complete, enough to start building out converter support.

Categories: Uncategorized

Metadata and Semantic Meaning

October 23, 2010 157 comments

Recently I’ve been diving into handling file format metadata.  A lot of common files have metadata, and it is generally expected that the metadata will be retained across format conversions where possible.  Music files have song titles, artist names, album names, track names, etc.  Digital cameras often embed date and time of picture, and cell phones can embed location.

Handling this in a generic way is both easy and hard. The current method I’m using to identify data is to embed semantic tagging in the abstraction of a converter. For example, LAME’s id3 title argument is tagged with <semantic type="music" tag="title"/>. This provides a very simple mechanism for giving meaning to the data, as we now know that it contains the same information as other things tagged “title”. As simple as this is in concept, there are several potential problems to mitigate. The first is text encoding. There is an excellent standard for encoding pretty much any form of text: Unicode. Unfortunately, people don’t yet universally use this standard, and in practice text encoding can be a mess. Additionally, non-string metadata will be encountered eventually. For now I’m sticking to string data, but all metadata handling is wrapped in a class with functions for setting and getting, so as new data types come up it should be relatively extensible.

All interfaces capable of handling metadata inherit the Semantic interface class, which means they provide simple functions to handle metadata transfer. Once the plugin code fully implements the Semantic interface, metadata transfer will be as simple as passing a Semantic pointer from the source to the loadSemantic() function.
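As a rough picture of what that could look like, here is a minimal sketch. Only the names Semantic and loadSemantic() come from the description above; setTag, getTag, tagNames, and the TagStore class are assumptions made for the example, and values are strings only, as in the current code.

```cpp
#include <map>
#include <string>
#include <vector>

// Interface for anything that can carry tagged metadata.
class Semantic {
public:
    virtual ~Semantic() {}

    // Hypothetical accessors; the real interface may differ.
    virtual void setTag(const std::string& tag, const std::string& value) = 0;
    virtual bool getTag(const std::string& tag, std::string& value) const = 0;
    virtual std::vector<std::string> tagNames() const = 0;

    // Copies every tag the source exposes into this object.
    void loadSemantic(const Semantic* source) {
        if (source == nullptr) {
            return;
        }
        for (const std::string& tag : source->tagNames()) {
            std::string value;
            if (source->getTag(tag, value)) {
                setTag(tag, value);
            }
        }
    }
};

// Minimal in-memory implementation, string values only.
class TagStore : public Semantic {
public:
    void setTag(const std::string& tag, const std::string& value) override {
        tags_[tag] = value;
    }

    bool getTag(const std::string& tag, std::string& value) const override {
        auto it = tags_.find(tag);
        if (it == tags_.end()) {
            return false;
        }
        value = it->second;
        return true;
    }

    std::vector<std::string> tagNames() const override {
        std::vector<std::string> names;
        for (const auto& entry : tags_) {
            names.push_back(entry.first);
        }
        return names;
    }

private:
    std::map<std::string, std::string> tags_;
};
```

Because loadSemantic() only goes through the virtual accessors, a source and a destination backed by completely different converters can exchange tags without knowing anything about each other.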

Categories: Uncategorized

Progress

October 10, 2010 4 comments

Figured I’d provide an update, because we haven’t had a lot of visible progress. Work goes on under the hood with core code: we have an abstraction framework for code execution so that running programs is platform independent and uses the C++ STL. As a result, 40-line test files can convert files, and have been used to convert Vorbis and FLAC for my iPod. There is a lot of code building out the converter abstraction, which will hopefully soon be complete enough to build test code with.

Categories: Uncategorized