You mean use Excel files?
That sounds good. Are there any similar kinds of files that can store blobs, etc.?
-kalos
Excel doesn't use data types like that, so you can't store a blob in it. Blobs are for RDBMSes.
Can you post a PDF?
-Renegade
But can't Excel store a PDF, or any other type of file, in a cell?
If not, which specific RDBMS would you suggest? (For a veeery simple task like this, one that doesn't require learning much.)
As for the PDFs, here is an example:
http://www.purolite....n%20Chem%20Specs.pdf
(renamed as Hydrochloric Muriatic Acid.pdf)
-kalos
Well, yes, Excel *can* have a file embedded, but not in a cell (AFAIK).
For an RDBMS, just about any will do. Take your pick of whatever you like really.
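If it helps to see what "store a blob" looks like in practice, here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module, which is only my pick for illustration; the table name, column names, and the placeholder bytes are made up, and a real script would read the bytes from the PDF file on disk.

```python
import sqlite3

# In-memory database for illustration; pass a file path to keep the data.
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per stored file, with the raw bytes in a BLOB column.
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, content BLOB NOT NULL)"
)

# Placeholder bytes standing in for the contents of a real PDF file.
pdf_bytes = b"%PDF-1.4 placeholder bytes"
name = "Hydrochloric Muriatic Acid.pdf"

# Python bytes objects are stored as BLOBs when bound as parameters.
conn.execute(
    "INSERT INTO documents (name, content) VALUES (?, ?)",
    (name, pdf_bytes),
)
conn.commit()

# Read the blob back out by file name.
row = conn.execute(
    "SELECT content FROM documents WHERE name = ?", (name,)
).fetchone()
stored = row[0]
```

Any other RDBMS with a blob or binary column type would follow the same pattern with its own driver.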
For the PDF... Sorry. You're hosed. Completely hosed.
The data in the PDF is, at best, in the first normal form (and *maybe* the second normal form), i.e. it's all mixed up and jumbled so as to be useless as a database.
Now, it *is* possible to write some software to go through all the different cases and parse it all, but it is VERY far from being a trivial task. Actually, each individual case is pretty easy; it's covering all of them that is extremely time consuming.
The only solution I see that is remotely quick is to cut each line at the = sign then use the left side as a key and the right as a value. Keys can either be unique or not, but I would probably want to normalize things and try to force them to be unique.
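A rough sketch of that split-at-the-= idea, assuming the text has already been pulled out of the PDF somehow as plain lines. The function name and the sample lines are made up for illustration and are not taken from the actual PDF. Keys are normalized (lower case, collapsed whitespace) to push them toward being unique, and repeated keys keep all of their values.

```python
from collections import defaultdict


def parse_key_values(lines):
    """Split each line at the first '=' into a key (left) and a value (right)."""
    records = defaultdict(list)
    for line in lines:
        if "=" not in line:
            continue  # no '=' sign, so nothing to split on
        key, value = line.split("=", 1)
        # Normalize the key: collapse whitespace and lower-case it,
        # so that "Density" and "  density" end up as the same key.
        key = " ".join(key.split()).lower()
        records[key].append(value.strip())
    return dict(records)


# Hypothetical sample lines, not real data from the PDF.
sample = [
    "Density = 1.18 g/mL",
    "  density=1.19 g/mL",
    "Appearance = clear liquid",
    "no separator here",
]
result = parse_key_values(sample)
```

Keeping a list of values per key means duplicates are not silently overwritten, so the question of which value wins can be settled later.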
Someone else may have a better idea.
As for extracting the data from the PDF, no clue. I loathe working with PDFs because they are a terminal format: once something is in that format, that's the end. Trying to get anything back out of them is a nightmare. You can try saving a PDF as a DOC to see just how terrifying the resulting monstrosities are...
I couldn't save the file you had as a DOC because it has a <bad font>, whatever that means... (not even as HTML...)
I tried copying & pasting... No luck. Cells are collapsed to spaces, which effectively reduces any hope of the second normal form to the first normal form, i.e. useless.
It's better to go from database to PDF. That makes sense and is manageable. Going from PDF to anything is nigh hopeless.