Planning a major programming project - tips requested

capitalH:
This will be a large file containing user data. There will be external files stored in their raw formats (JPEG, PDF, etc.). I was debating whether to use an existing database system like SQLite, but I also know that I have written my own file-save procedures in the past and they have worked out nicely.

I was advised AGAINST XML because of the sheer complexity of implementation. There is no desire to export this data, as my application will be the first in its industry.
-Josh (January 12, 2012, 10:28 PM)
--- End quote ---

I don't think you can overestimate interoperability and ease of access. Users will thank you for it if they ever need to access the data outside of your app.

I wouldn't say XML is complex, as long as you can use a third-party parser that does the heavy lifting for you. But it is (a) relatively slow to read; (b) wasteful when the tags take up more bytes than the user data; (c) potentially memory-intensive, depending on the parser and how you use it; (d) brittle, because a single wrong byte may prevent your app from loading any of the data (that's particularly important if your users may ever fiddle with the data files themselves).

But XML is very useful and flexible when you need or expect to (a) represent hierarchical data; (b) store data of different types, including binary; (c) modify and expand the capabilities of your app or your data files. You can easily add a new tag to store a new piece of data, which earlier versions of your app may just ignore without breaking. Doing the same in custom binary files takes more work, as you have to introduce some sort of "file version" marker and check for it all the time (version 1.1, so expect a date now; version 1.2, time is now stored as UTC; version 1.3, added a comment field here...) That last thing is much easier to do with XML.
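To illustrate the point about new tags being ignored by older versions, here is a minimal Python sketch. The tag names (`record`, `name`, `date`, `comment`) and the function `load_records_v1` are made up for the example; only the standard `xml.etree.ElementTree` module is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical file written by a newer app version that added a <comment> tag.
NEWER_FILE = """<records>
  <record>
    <name>Alice</name>
    <date>2012-01-13</date>
    <comment>added in a later version</comment>
  </record>
</records>"""

def load_records_v1(xml_text):
    """Older reader: asks only for the tags it knows, so extra tags are skipped."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("record"):
        records.append({
            "name": rec.findtext("name", default=""),
            "date": rec.findtext("date", default=""),
        })
    return records

records = load_records_v1(NEWER_FILE)
print(records)
```

The older reader never mentions `comment`, so it loads the newer file without any version check. This only holds if the reader looks tags up by name rather than assuming a fixed order or count.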

You could also design your custom "binary" format that is structured like XML, with numeric identifiers instead of tags. That worked for me once, but after that I decided SQLite was easier and faster and more capable to begin with.
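One possible shape for such a format is a sequence of (id, length, payload) fields. The sketch below is only an assumption about what "structured like XML, with numeric identifiers" could look like, not the format I actually used; the field ids and helper names are hypothetical.

```python
import struct

# Hypothetical numeric field identifiers standing in for XML tag names.
FIELD_NAME = 1
FIELD_COMMENT = 2

# Each field starts with: field id (uint16) + payload length (uint32), little-endian.
HEADER = struct.Struct("<HI")

def write_field(field_id, payload):
    """Serialize one field as header + raw payload bytes."""
    return HEADER.pack(field_id, len(payload)) + payload

def read_fields(blob):
    """Yield (field_id, payload) pairs; the length prefix lets a reader step over unknown ids."""
    offset = 0
    while offset < len(blob):
        field_id, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        yield field_id, blob[offset:offset + length]
        offset += length

# A file from a "newer version" that contains an id (99) this reader does not know.
blob = (
    write_field(FIELD_NAME, b"Alice")
    + write_field(99, b"future data")
    + write_field(FIELD_COMMENT, b"hello")
)

known = {fid: data for fid, data in read_fields(blob)
         if fid in (FIELD_NAME, FIELD_COMMENT)}
print(known)
```

Because every field carries its own length, an older reader can skip ids it has never seen, which gives roughly the same forward compatibility as ignoring an unknown XML tag.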

SQLite is really neat in that it doesn't need a server or any complex installation (or any db license). If you use MySQL, who's going to install it, and who's going to do tech support for MySQL issues, for example? For me, the main advantage of a database is that you can read and write small portions of your data instead of the whole data file at once. And you get a lot of built-in logic for free via SELECT queries that you would otherwise have to devise yourself. Frequent disk access is an obvious downside, although you can configure your SQLite database to stay in RAM (by specifying a sufficient cache size or even explicitly copying the db to RAM).
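A minimal sketch of the "small portions" idea, using Python's standard `sqlite3` module. The `customers` table and its contents are invented for the example, and an in-memory database stands in for a file on disk.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; a real app would pass a file path instead.
conn = sqlite3.connect(":memory:")

# A negative cache_size is interpreted by SQLite as a size in KiB (here about 64 MiB).
conn.execute("PRAGMA cache_size = -65536")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Alice", "Oslo"), ("Bob", "Lima"), ("Carol", "Oslo")],
)
conn.commit()

# Read only the rows that are needed, not the whole data set.
oslo = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Oslo",))]

# Update a single row in place; nothing else is rewritten by the application.
conn.execute("UPDATE customers SET city = ? WHERE name = ?", ("Quito", "Bob"))
conn.commit()
bob_city = conn.execute(
    "SELECT city FROM customers WHERE name = ?", ("Bob",)).fetchone()[0]

print(oslo, bob_city)
```

With a hand-written file format, the filtering and the single-row update would both be code you write and maintain yourself.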

Another downside of SQLite for me is interminable query strings everywhere. I do keep them all in one place, actually, but unless you generate all of your SQL at runtime, you may end up with hundreds of SELECT statements that are hard to maintain. If you add a column to a table, you need to go through all of them to see which ones must be modified. I hate that part. You can of course parametrize the queries and glue them together from pieces, but the pieces still remain as string constants, and that's bad. Still, right now I wish I'd written almost all my apps with an SQLite backend, and I'll be using it for practically every future project I can think of. It's really flexible, and by far the fastest database engine on the desktop.
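As a rough sketch of gluing queries together from pieces, the column list below is defined once and every SELECT is built from it, so adding a column means touching one tuple. The `notes` table, `NOTE_COLUMNS` and `select_notes` are hypothetical names; the pieces are still string constants, so this reduces the problem rather than removing it.

```python
import sqlite3

# Hypothetical schema: the column list for the table is defined in exactly one place.
NOTE_COLUMNS = ("id", "title", "body")

def select_notes(where=""):
    """Build a SELECT from the shared column list.

    `where` must contain only column names and ? placeholders, never user input.
    """
    sql = "SELECT {} FROM notes".format(", ".join(NOTE_COLUMNS))
    return sql + (" WHERE " + where if where else "")

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be indexed by column name
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO notes (title, body) VALUES (?, ?)", ("todo", "buy milk"))

# Values are still passed as parameters, not pasted into the SQL text.
row = conn.execute(select_notes("title = ?"), ("todo",)).fetchone()
note = {col: row[col] for col in NOTE_COLUMNS}
print(note)
```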
-tranglos (January 13, 2012, 09:46 AM)
--- End quote ---

I am more of a data guy than a programming guy (statistician by trade), so here goes my 2c:

XML - very useful for hierarchical data, and easier to use than it looks at first. My first attempts failed miserably (when I attempted to do my own parsing), but with MSXML (or potentially another library) it is dead easy. Speed is not an issue with smallish data (I used it successfully for 10 000 records, about 50 attributes per record, very hierarchical) - though I only use it during load/save, not updating it continuously.

SQL (or other database) - very useful if you mostly work on subsections of the data, or if you have huge data and can work with a subsection at a time.

CSV - my personal favorite. Works everywhere, and you decide the format. Open and transparent. For some protection, compress it before saving. Does not work that well with huge data (usually limited by memory). Does not work well with hierarchical data or data that can have a variable number of attributes per record (say, for example, you have a bunch of projects (records) and, for each project, a list of people working on it - you will probably have to have a cell that has a separator which)
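The projects-and-people case could be handled roughly like this in Python: the variable-length list is joined into one cell with a second separator. The `;` separator and the column names are assumptions made for the example, and the approach breaks if a name can itself contain `;`.

```python
import csv
import io

# Hypothetical choice: people within one cell are joined with ';'.
INNER_SEP = ";"

projects = [
    {"project": "Website", "people": ["Ann", "Raj"]},
    {"project": "Billing", "people": ["Li"]},
]

# Write: one row per project, with the list of people flattened into a single cell.
buf = io.StringIO()  # stands in for a real file opened with newline=""
writer = csv.writer(buf)
writer.writerow(["project", "people"])
for p in projects:
    writer.writerow([p["project"], INNER_SEP.join(p["people"])])

# Read: split the cell back into a list.
buf.seek(0)
loaded = [
    {"project": row["project"], "people": row["people"].split(INNER_SEP)}
    for row in csv.DictReader(buf)
]
print(loaded)
```

This works for one level of nesting; anything deeper is where XML or a database starts to look more comfortable.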
