Author Topic: any spreadsheet-like tool to open tables with millions of rows/columns?  (Read 9334 times)

urlwolf

  • Charter Member
  • Joined in 2006
  • Posts: 1,837
Do you know of any spreadsheet-like tool to open tables with millions of rows/columns?
I really like EmEditor because it can open large files without loading them entirely into memory.
If I could find a spreadsheet tool that did the same, it'd be great.

Thanks

tinjaw

  • Supporting Member
  • Joined in 2006
  • Posts: 1,927
I think this is a parallel of owning a hammer and everything looking like a nail. If you have that much data, you need a database and a frontend. Now, you can use a spreadsheet as a frontend to the database, but you need that data in a db. Then you only view the results of calculations in the spreadsheet.

But you are a smart guy, so I must be missing something.  :huh:
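To make the "database backend, spreadsheet frontend" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its columns, and the `summarize` helper are made up for illustration; the point is only that the aggregation runs inside the database and just the small result reaches the viewer.

```python
import sqlite3


def summarize(conn):
    """Let the database aggregate; only the small summary is returned."""
    return conn.execute(
        "SELECT region, COUNT(*), SUM(amount) "
        "FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


# Hypothetical data set: 100,000 rows split across two regions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    ((("north", "south")[i % 2], float(i)) for i in range(100_000)),
)

summary = summarize(conn)  # two rows, regardless of how large the table is
for region, count, total in summary:
    print(region, count, total)
```

A spreadsheet pointed at `summary` only ever holds two rows, however many rows sit in the table.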

tide

  • Supporting Member
  • Joined in 2007
  • Posts: 84
I agree with tinjaw. I can't imagine a supermassive spreadsheet with any kind of structure being able to look stuff up in a reasonable time, to say nothing of recalculating.

urlwolf

I do have a database :)
I still miss some things a spreadsheet can do.

I'm right now torn between different modes to analyze data.
Right now I'm R-centric.
(www.r-project.org).
But in R you do everything on the command line. All changes are logged, which is good. Spreadsheets are a big no-no in this community.

The rdbms community has different ideas about how to manage data.
Basically you get what you want one query at a time.

And of course, Joe Six-Pack does everything with Excel.

I wonder if I'm missing much of the appeal of a visual way to handle data. Color-code cells according to contents, zoom in and out with a slider, ... that kind of feeling. Very exploratory. Compare that to a command line where you do head(data) :)

Since EmEditor and R prove that you can do lots of operations without loading everything into memory (or onto the screen), I see no reason why one cannot have a streaming spreadsheet that loads just the parts you need on the screen really fast.
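A rough sketch of what such a streaming viewer could do under the hood, in Python. This is not how EmEditor or R work internally; it is one possible approach, and the function names `build_row_index` and `read_window` are invented here. It assumes a plain CSV file with one record per line (no quoted fields containing newlines). One pass records where each line starts, and after that any screenful of rows can be fetched with a seek.

```python
import csv
import os
import tempfile


def build_row_index(path):
    """One pass over the file, keeping only the byte offset of each line."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def read_window(path, offsets, first_row, n_rows):
    """Seek to first_row and parse only the rows that fit on screen."""
    if first_row >= len(offsets):
        return []
    count = min(n_rows, len(offsets) - first_row)
    with open(path, "rb") as f:
        f.seek(offsets[first_row])
        lines = [f.readline().decode("utf-8") for _ in range(count)]
    return list(csv.reader(lines))


# Demo on a small temporary file; a real file would have millions of rows.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", newline="") as f:
    for i in range(10_000):
        f.write(f"{i},{i * i}\n")

offsets = build_row_index(path)
window = read_window(path, offsets, first_row=5000, n_rows=3)
os.remove(path)
print(window)
```

The index costs one integer per row, so it is far smaller than the data itself; scrolling then only ever touches the rows currently in view.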

Doing operations by selecting a block would generate the equivalent SELECT for the db... if you need one. There might be very fast alternatives.
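As an illustration of the block-to-SELECT idea, here is a small Python sketch against SQLite. The helper `block_to_select` and the table `t` are hypothetical; the sketch assumes the grid shows rows in `rowid` order and that a block is a set of columns plus an inclusive, 0-based row range.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def block_to_select(table, columns, first_row, last_row):
    """Translate a selected grid block into a SELECT plus its parameters."""
    cols = ", ".join(quote_ident(c) for c in columns)
    sql = (
        f"SELECT {cols} FROM {quote_ident(table)} "
        "ORDER BY rowid LIMIT ? OFFSET ?"
    )
    return sql, (last_row - first_row + 1, first_row)


# Hypothetical table with three numeric columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    ((i, i * 2, i * 3) for i in range(1000)),
)

# The user drags over columns b and c, rows 10 through 12.
sql, params = block_to_select("t", ["b", "c"], 10, 12)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

`OFFSET` makes the database skip over the preceding rows, so for very deep scrolling a range condition on an indexed key (for example `WHERE rowid BETWEEN ? AND ?`) is likely one of the faster alternatives.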

I think there are many paradigms for data handling that are not that well explored. For example, comparing OO, RDBMS and RDF (linked data) is eye-opening. This goes beyond the point I'm making here about a streaming viewer (btw GUIs for RDBMS do suck at this; it's not that hard!)... but still. Let me know if you want more info or if I'm not making sense at all. I have a paper in mind on object-relational mappings that is very good, if a bit technical.

MikeMcLoughlin

  • Supporting Member
  • Joined in 2007
  • Posts: 31
Have you looked at CSVEd? It may do what you want.

CSVEd

tinjaw

I am having a difficult time summarizing what I am thinking. That's no excuse for me to just ramble, but it might come out that way.  :-[

You are mixing apples and oranges here. I am going to hope that you understand Model-View-Controller (MVC) architecture. In layman's terms: Data, Display, and Calculations. The focus of a spreadsheet is View and Controller, while the focus of an RDBMS is Model. Hence my suggestion of using an RDBMS backend with a spreadsheet frontend.

For lack of a better term (and a lack of coffee on my part), the "missing link" in your perception of the issue is that you are mixing the M, V, and C together in your mind. The spreadsheet appears to be doing them all at once, instantaneously, because of the GUI. With an RDBMS you notice the three steps more clearly, because the workflow is distinctly separated into data entry, SQL query, and displaying the result.

I don't think the issue is one of keeping the dataset in memory vs on disk. I think it is simply a matter of workflow -- of how you perceive a spreadsheet as being more conducive to "playing" with data.

urlwolf

@MikeMcLoughlin: thanks, CSVed looks very good.
@tinjaw: good analogy; never thought of it that way. Right now:
Data/model: MySQL, which I have to dump into huge text files because I could not get the RMySQL package to work.
View/display: the weak point of R. I miss Excel's conditional formatting, quick sorts, etc. For now I do this from R by dumping to text files or the clipboard and pasting into Excel.
Controller/calculations: R, which is doing a good job.

I guess I just want the best of all paradigms and I cannot get that.

kfitting

  • Charter Member
  • Joined in 2005
  • Posts: 593
One problem I have with databases is that it takes a fair amount of time to design and write a frontend... I like spreadsheets because of the inherent GUI.  I really do wish there was a better way to mix the two ideas.  I don't have time to write a GUI for every little database I would like to make!

Kevin

tinjaw

I'm not claiming it is a panacea, but it is something you should check out if you haven't already.

Resolver One is a program that blends a familiar spreadsheet-like interface with the powerful Python programming language, giving you a tool with which to better analyze and present your data.

urlwolf

I tried Resolver One; it is not very stable, and it chokes on largish files. A pity, because the idea is very good.

PPLandry

  • Supporting Member
  • Joined in 2007
  • Posts: 702
    • InfoQube Information manager
I'm the designer of InfoQube, which sits exactly in between a database and a spreadsheet. You can have millions of rows, but it is easy to extract a small subset. The front-end UI is that of an outlining grid (+1 over Excel) and the back end is a database (JET 4.0). Adding fields to your tables is easy. You can color rows, outline rows into a hierarchy and add extra rich text info for each row. It supports built-in VBScript functions and user-defined ones too. Plus, if you've got a version of MS Office installed, you can use pivot tables and charts (i.e. OLAP cubes) to analyze / summarize your data. Data in your IQubes can be viewed live in Excel and Word, which is great for sharing with others.

www.InfoQube.biz
Real generosity toward the future lies in giving all to the present -- Albert Camus -- www.InfoQube.biz

tinjaw

Ah! I forgot about SQLNotes InfoQube !! I should revisit it.  :up: