Author Topic: Idea - Return *new* lines from a text file (w/ example)  (Read 5480 times)
strictlyfocused02
Participant
*
Posts: 10

« on: July 07, 2010, 09:28:12 AM »

I would love a proper version of this function I found, which returns *solely* the new lines from a text/log file. The first call gets the file size, and each later call returns only the text that has been added since the previous call. I'm not sure whether anyone needs more information from me to make this happen, but I would be more than happy to answer any questions to the best of my ability.

I have a semi-working function of this that was written in AHK. The reason I say semi-working is because in my script this function is called very often and after a couple hundred calls it causes the local machine to hard-lock.

P.S. - The original author of this function (denick) has since removed the code from his post because of the issue I mentioned above.

;Returns new lines from a text file.  First call gets FileSize, 2nd+ call gets new lines.  Converted German to English.
FileTail(File) ;;by denick (http://de.autohotkey.com/forum/topic4619.html)
{
   Static OPEN_EXISTING := 3
   Static GENERIC_READ := 0x80000000
   Static FILE_SHARE_READ := 1
   Static FILE_SHARE_WRITE := 2
   Static FILE_SHARE_DELETE := 4
   Static FILE_BEGIN := 0
   Static INVALID_HANDLE_VALUE := -1
   Static CF := ""   ; Current file
   Static FP := 0    ; Current position in the file (file pointer)
   FH := 0           ; File handle
   FS := 0           ; File size
   FC := ""          ; Content of the lines read (file content)
   CL := 0           ; Length of the lines read (content length)
   If (File != CF) {
      CF := File, FP := 0
   }
   FH := DllCall("CreateFile"
               , "Str",  File
               , "UInt", GENERIC_READ
               , "UInt", FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE
               , "UInt", 0
               , "UInt", OPEN_EXISTING
               , "UInt", 0
               , "UInt", 0)
   If (FH = INVALID_HANDLE_VALUE) {
      CF := "", FP := 0
      MsgBox, 262160, FileTail, The file *%File%* could not be opened!
   } Else {
      DllCall("GetFileSizeEx"
            , "UInt",   FH
            , "Int64P", FS)
      If (FP = 0 Or FS <= FP) {
         FP := FS
      } Else {
         DllCall("SetFilePointerEx"
               , "UInt",  FH
               , "Int64", FP
               , "UInt",  0
               , "UInt",  FILE_BEGIN)
         CL := VarSetCapacity(FC, FS - FP, 0)
         DllCall("ReadFile"
               , "UInt",  FH
               , "UInt",  &FC
               , "UInt",  CL
               , "UIntP", CL
               , "UInt",  0)
         VarSetCapacity(FC, -1)
         FP := FS
      }
      DllCall("CloseHandle", "UInt", FH)
   }
   Return FC
}
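
For anyone who doesn't read AHK, the idea behind FileTail can be sketched in Python roughly as below. This is only a sketch under assumptions, not a port of denick's code and not a diagnosis of the hard-lock: the `FileTailer` class and its `read_new` method are names made up here, the log is assumed to be append-only, and the bytes are decoded as UTF-8 with replacement characters for anything invalid.

```python
import os


class FileTailer:
    """Remembers a byte offset between calls (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self.offset = None  # None until the first call has recorded the size

    def read_new(self):
        size = os.path.getsize(self.path)
        # First call, or the file shrank (rolled/truncated): only remember the size.
        if self.offset is None or size < self.offset:
            self.offset = size
            return ""
        if size == self.offset:
            return ""
        # The with-block closes the handle even if the read raises.
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read(size - self.offset)
        self.offset += len(data)
        return data.decode("utf-8", errors="replace")
```

The offset lives on the instance, much like the Static FP in the AHK version, so one `FileTailer` per log file would be created once and `read_new()` called on each tick.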
« Last Edit: July 07, 2010, 09:29:49 AM by strictlyfocused02 » Logged
daddydave
Supporting Member
**
Posts: 818



« Reply #1 on: July 07, 2010, 09:54:30 AM »

Point of clarification: Are the new lines guaranteed to be at the end of the file?
Logged
strictlyfocused02
Participant
*
Posts: 10

« Reply #2 on: July 07, 2010, 09:55:54 AM »

Quote from: daddydave
Point of clarification: Are the new lines guaranteed to be at the end of the file?

Yes
Logged
daddydave
Supporting Member
**
Posts: 818



« Reply #3 on: July 07, 2010, 11:46:07 AM »

Not sure if you want to do it this way (in fact I'm pretty sure you don't), but it looks like it can be done from a batch file:


if not exist file.txt exit
fc oldfile.txt file.txt | find /v "*****"
copy file.txt oldfile.txt
pause


« Last Edit: July 07, 2010, 11:51:04 AM by daddydave » Logged
strictlyfocused02
Participant
*
Posts: 10

« Reply #4 on: July 07, 2010, 11:54:18 AM »

Quote from: daddydave
Not sure if you want to do it this way, but it looks like it can be done from a batch file:

if not exist file.txt exit
fc oldfile.txt file.txt | find /v "*****"
copy file.txt oldfile.txt
pause

I can't copy the files. The log files being read can sometimes be well over 10 MB, so a copy-based solution isn't practical.
Logged
daddydave
Supporting Member
**
Posts: 818



« Reply #5 on: July 07, 2010, 12:18:29 PM »

Quote from: strictlyfocused02
I can't copy the files. The log files being read can sometimes be well over 10 MB, so a copy-based solution isn't practical.

That makes sense. The source code you posted is way over my head, and someone else will need to chime in :), but is the gist of it that the function monitors the file size, so that if it was n bytes and is now n + 1000 bytes, it returns the last 1000 bytes as lines?

« Last Edit: July 07, 2010, 12:20:25 PM by daddydave » Logged
CWuestefeld
Supporting Member
**
Posts: 934



« Reply #6 on: July 08, 2010, 12:05:48 PM »

You just need the common tail utility.

I frequently use a Windows GUI-based version called BareTail, which is free and recommended.
Logged



daddydave
Supporting Member
**
Posts: 818



« Reply #7 on: July 08, 2010, 12:10:36 PM »

Quote from: CWuestefeld
You just need the common tail utility.

I thought of that, but I didn't think it was smart enough to only display the newest lines and not just the last N lines.
Logged
strictlyfocused02
Participant
*
Posts: 10

« Reply #8 on: July 08, 2010, 02:55:05 PM »

Quote from: CWuestefeld
You just need the common tail utility.

I frequently use a Windows GUI-based version called BareTail, which is free and recommended.

Quote from: daddydave
I thought of that, but I didn't think it was smart enough to only display the newest lines and not just the last N lines.

daddydave is correct: tail performs a similar function to what I'm looking for, but it falls short in one area. The log could have 3 lines written to it or 300, and I would like to be able to retrieve and store solely those new lines.

To hopefully save some time: I have already tried various line-counting FileRead techniques in AutoHotkey, but with my log files being potentially enormous, this takes forever.
Logged
rjbull
Charter Member
***
Posts: 2,773

« Reply #9 on: July 08, 2010, 03:39:25 PM »

Could you:

1) use a "tail" utility to save the last n lines
2) when you want to check what's changed, use "tail" again to save the new tail
3) use a "diff" utility to see the difference?

The UnxUtils project on SourceForge contains command-line versions of both "tail" and "diff."
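
The tail-then-diff steps above could be sketched roughly like this in Python. It assumes both tails were saved with the same line count, that the log is append-only, and that the tails are already in memory as lists of lines; `lines_added` is a made-up helper name, and a plain list comparison stands in for a real diff tool.

```python
def lines_added(old_tail, new_tail):
    """Return (added_lines, may_have_missed) for two saved tails.

    old_tail and new_tail are lists of lines taken with the same line count.
    """
    n = len(new_tail)
    # If m lines were appended, old_tail[m:] must match the start of new_tail.
    # Try the smallest m first.
    for m in range(n):
        keep = len(old_tail) - m
        if keep <= 0:
            break
        if old_tail[m:] == new_tail[:keep]:
            return new_tail[keep:], False
    # No overlap at all: everything in new_tail is new, and more may be lost.
    return list(new_tail), True
```

The second return value shows the weak spot of this route: once the two tails no longer overlap, there is no way to tell whether anything was missed in between. Runs of identical lines can also make the overlap ambiguous.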
Logged
strictlyfocused02
Participant
*
Posts: 10

« Reply #10 on: July 08, 2010, 05:25:01 PM »

Quote from: rjbull
Could you:

1) use a "tail" utility to save the last n lines
2) when you want to check what's changed, use "tail" again to save the new tail
3) use a "diff" utility to see the difference?

The UnxUtils project on SourceForge contains command-line versions of both "tail" and "diff."

I suppose I could make this work, but it would definitely end up convoluted. I would probably have to tail the last 500 lines twice and then diff them, which would really suck if only 2 lines have been added. Even worse, if more than 500 lines have been written, a lot of info could get overlooked.

Even if someone more familiar with AutoHotkey were able to pinpoint where the function I posted is flawed that would be fine.
Logged
rjbull
Charter Member
***
Posts: 2,773

« Reply #11 on: July 11, 2010, 11:07:56 AM »

Haven't got anything more helpful to add :(  I was expecting to set it up with a batch file that would save the end of the old log, save the end of the new log, diff them, then delete the old tail and rename the new tail as the old tail. But I agree that you wouldn't know where to cut the tail from. I assumed you'd have to use tail because diff wouldn't be able to cope with files that were in use and constantly being added to. If you could find a diff that coped with log files that were presumably locked, you could use it directly.

I don't know if ExamDiff Pro, the visual diff I use, can do this, but I can't see it in the Help file.  Maybe one of the others can, like Beyond Compare, the favourite on DC?
Logged
steeladept
Supporting Member
**
Posts: 1,056



Fettucini alfredo is macaroni & cheese for adults

« Reply #12 on: July 13, 2010, 02:32:13 PM »

What about something simple like this:

1) Read backward through log file looking for EOF (whatever delimiter you define as the end of file)
2) Display out lines after last EOF.
3) Write new EOF at end.

Note that this should not be the standard EOF but rather some special symbol (string?) that won't ever be written in normal log files.  Maybe something in a foreign language with alternating case or something.  Whatever it is, make it its own line entry written by the parser to designate the end of the last file and read only from there forward.
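
The three steps above might look something like this in Python. Treat it as a sketch under assumptions: the `MARKER` string and the `read_since_marker` name are invented for illustration, the reader must be allowed to write to the log, every log line is assumed to end with a newline, and nothing else is assumed to append between the read and the marker write.

```python
MARKER = b"#@@ tail-marker: read up to here @@#\n"  # hypothetical sentinel line


def read_since_marker(path, chunk_size=64 * 1024):
    """Return the bytes after the last MARKER line, then append a new MARKER."""
    with open(path, "r+b") as f:
        f.seek(0, 2)                      # 2 = seek relative to end of file
        pos = f.tell()
        tail = b""
        found = -1
        # Step 1: walk backward in chunks until the marker shows up
        # or the start of the file is reached.
        while pos > 0 and found < 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            tail = f.read(step) + tail
            found = tail.rfind(MARKER)
        # Step 2: everything after the last marker is new.
        new_data = tail[found + len(MARKER):] if found >= 0 else tail
        # Step 3: write a fresh marker, but only if something new was read.
        if new_data:
            f.seek(0, 2)
            f.write(MARKER)
    return new_data
```

With no marker in the file yet, the first call returns the whole log, which could be slow on a very large file.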

Just my logic, coding it could be much worse.
Logged
markcramer
Supporting Member
**
Posts: 4

« Reply #13 on: July 15, 2010, 12:47:09 AM »

Most tail implementations I've seen can also show just the last N bytes.

So save the current file size, then when you want to display the difference, subtract the old size from the current one and display just that count of bytes using tail?

If you need to roll the log file, just reset the saved size to 0.

Is there any reason this won't work?
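
As a rough illustration of the size-subtraction idea, here is a Python sketch. `last_n_bytes` plays the part of a tail that shows the last N bytes, and `new_bytes` does the subtraction; both names are made up here, and the file is assumed to be append-only between calls.

```python
import os


def last_n_bytes(path, n):
    """Return the final n bytes of the file (all of it if the file is shorter)."""
    if n <= 0:
        return b""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(size - n, 0))
        return f.read()


def new_bytes(path, saved_size):
    """Return (new_data, new_size); pass saved_size=0 after the log is rolled."""
    size = os.path.getsize(path)
    return last_n_bytes(path, size - saved_size), size
```

One possible catch: if the log grows between the size check and the tail call, "the last N bytes" no longer starts where the previous read stopped, so a little data could be skipped or repeated. Seeking to the saved size directly avoids that.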
Logged
CmputrAce
Charter Member
***
Posts: 8

« Reply #14 on: July 16, 2010, 08:23:06 AM »

1. The calling script is written in what language?
2. Do you want a command-line app that just returns the new text to stdout (CON:)?

I have a few thoughts... if I can write a function that you can embed in your script, then keeping the file size between calls is much easier. If I write a quick DOS program to do this, then I'd build it this way:
if you call the program with only the file name, I will return the file size to you. If you call the program with the file name AND the file size, I will return the new file size, a newline, then the text beginning at the character+1 position you submitted as the second parameter to the call.

This should be pretty easy, IMO.
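
The calling convention described above could be sketched in Python like this. It is only an illustration of the protocol, not CmputrAce's program: the `run` function name is invented, the offset is treated as a byte count, and a saved size larger than the current file is assumed to mean the log was rolled.

```python
import os
import sys


def run(argv, out=sys.stdout):
    """argv = [file]: print the size.
    argv = [file, old_size]: print the new size, a newline, then the new text.
    """
    path = argv[0]
    size = os.path.getsize(path)
    if len(argv) == 1:
        out.write(f"{size}\n")
        return 0
    start = int(argv[1])
    if start > size:          # file is smaller than last time: assume it was rolled
        start = 0
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read()
    # Report the size actually consumed, so the next call resumes exactly here.
    out.write(f"{start + len(data)}\n")
    out.write(data.decode("utf-8", errors="replace"))
    return 0
```

To use it as a script, a last line such as `sys.exit(run(sys.argv[1:]))` would be added; the caller then keeps the first output line as the size to pass in next time.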
Logged
strictlyfocused02
Participant
*
Posts: 10

« Reply #15 on: August 07, 2010, 08:56:40 AM »

Quote from: CmputrAce
1. The calling script is written in what language?
2. Do you want a command-line app that just returns the new text to stdout (CON:)?

I have a few thoughts... if I can write a function that you can embed in your script, then keeping the file size between calls is much easier. If I write a quick DOS program to do this, then I'd build it this way:
if you call the program with only the file name, I will return the file size to you. If you call the program with the file name AND the file size, I will return the new file size, a newline, then the text beginning at the character+1 position you submitted as the second parameter to the call.

This should be pretty easy, IMO.

1. The calling script is written in AHK.
2. Command line going to stdout would be *perfect*.

Regarding your other thoughts on it, the DOS route would be ideal following the procedure you mention.

P.S. Sorry for the delayed response, crazy busy at work with a new promotion =D
Logged
MilesAhead
Member
**
Posts: 4,921



« Reply #16 on: August 07, 2010, 12:23:07 PM »

@strictlyfocused02 what machine are you running the script on? How much memory? If calling an external program that produces a file with only the tail lines is an option, and provided you have a fair amount of ram, it should be nearly trivial in AutoIt3.

Pseudo code should go something like:

Read .ini file to get previous number of lines in file
(first run would be zero, in which case the program would
only save the new line count to .ini file.  Exit code could
indicate what the program did or just use the .ini file
for IPC.)


Load entire file into an array with
_FileReadToArray()

now $array[0] has the current line count

if $currentCount > $prevCount then
_FileWriteFromArray("DiffTextFilename.txt", $array, $prevCount + 1)

now display the diff text file however you want,
then save the new current count to .ini file.

Even if the log file is 10 or 20 MB, these days, with a gig or 2 of RAM, it should be no big deal to read the whole file into an array.
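
For comparison, the pseudo code above translates fairly directly into Python. This is a sketch rather than the AutoIt3 version: `dump_new_lines`, the `[state]` section and the `lines` key are all names made up here, and the log is assumed to be UTF-8 text.

```python
import configparser


def dump_new_lines(log_path, ini_path, diff_path):
    """Write lines added since the last run to diff_path; return how many."""
    cfg = configparser.ConfigParser()
    cfg.read(ini_path)                      # a missing .ini file is simply ignored
    first_run = not cfg.has_option("state", "lines")
    prev = cfg.getint("state", "lines", fallback=0)

    # Load the entire file into a list, as the pseudo code does with an array.
    with open(log_path, "r", encoding="utf-8", errors="replace") as f:
        lines = f.readlines()

    # On the first run only the line count is saved; nothing is dumped.
    new = [] if first_run or len(lines) <= prev else lines[prev:]
    if new:
        with open(diff_path, "w", encoding="utf-8") as f:
            f.writelines(new)

    cfg["state"] = {"lines": str(len(lines))}
    with open(ini_path, "w") as f:
        cfg.write(f)
    return len(new)
```

Every call still reads the whole log, so the cost grows with the file even when only a couple of lines were added.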


« Last Edit: August 07, 2010, 12:35:14 PM by MilesAhead » Logged

"Genius is not knowing you can't do it that way."
- MilesAhead
skwire
Global Moderator
*****
Posts: 4,111



Another Coding Snack request? Om nom nom...

« Reply #17 on: August 09, 2010, 08:54:47 AM »

From what I recall from some #ahk IRC conversations a while back, the original poster is calling this function four times per second.  Also, the file to be retrieved is over a network and not a local file.  strictlyfocused02, please correct me if I'm wrong.
Logged

strictlyfocused02
Participant
*
Posts: 10

« Reply #18 on: August 09, 2010, 09:07:17 AM »

Quote from: skwire
From what I recall from some #ahk IRC conversations a while back, the original poster is calling this function four times per second.  Also, the file to be retrieved is over a network and not a local file.  strictlyfocused02, please correct me if I'm wrong.

I'm impressed by your memory, skwire :). You are correct that it is being called four times a second, but it's a local file. The file can get quite large after a short while (100,000+ lines).
Logged
MilesAhead
Member
**
Posts: 4,921



« Reply #19 on: August 09, 2010, 11:51:29 AM »

Funny how information supplied later invalidates things. If all you are going to do is dump the tail to the console, then just about anything that can move the file pointer back from the end by x bytes, dump that to the console, and store the new file length should work. Since it's guaranteed to be text, it's already formatted with newlines etc., supposedly.

btw how are you determining how often to call this function? Is there some notification, or did somebody arbitrarily pick a 250 ms sleep loop? Would it make a difference if you called it twice a second and just dumped more lines? Or is that part out of your control?

edit: I don't know what code you have to deal with or how much influence you have over changing it, but it may be a better approach to test the log for modification 4 times a second and just set a flag, then create the output maybe once every 5 or 8 seconds, or whatever suits, depending on whether there's a pattern to the size of the appended data. Shoving the stuff to output every time a line is stuck on seems kind of inefficient. But I realize one is often stuck with something that can't be changed due to edicts from higher up. :) With the thing being over 10 MB, if the disk is fragmented, even a tail publisher written in assembler might not be able to go to end of file, back up, grab a chunk and stick it on the console (which usually has pretty slow display routines), and come back for another drink 250 ms later. Not much tolerance for delay.

edit2: another consideration: if you are sending the output to the same hard-wired destination, it may be simpler to use some mechanism to tee it at a lower level. What is appended to the file goes to the screen immediately, with the console scrolling taking care of old stuff disappearing, etc., if that's an option under your control.


« Last Edit: August 09, 2010, 12:14:37 PM by MilesAhead » Logged

"Genius is not knowing you can't do it that way."
- MilesAhead
f0dder
Charter Honorary Member
***
Posts: 8,774



[Well, THAT escalated quickly!]

« Reply #20 on: August 09, 2010, 12:25:14 PM »

Is this supposed to be a general-purpose utility, or "pretty specific"? Do you want the program to output new lines and exit, or would continually running & monitoring work? This all affects strategies :)

If the program can run continuously, it can monitor changes and only read the file on change - and it can rate-limit this to whatever/second in case of often modified files, plus it can cache the "where'd I toddle off to last" in memory.
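
A continuously running variant along those lines might look like this in Python. Note the substitution: plain polling of the file size stands in for real change notifications, which is simpler but not the same thing. The `follow` name and its parameters are made up for this sketch, and the log is assumed to be UTF-8 text.

```python
import os
import time


def follow(path, min_interval=0.25, max_polls=None, start=None):
    """Yield newly appended text; the offset is cached in memory between polls."""
    # By default, start at the current end of the file.
    offset = os.path.getsize(path) if start is None else start
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        size = os.path.getsize(path)
        if size < offset:                 # file was truncated or rolled
            offset = 0
        if size > offset:                 # only open the file when it has grown
            with open(path, "rb") as f:
                f.seek(offset)
                data = f.read()
            offset += len(data)
            yield data.decode("utf-8", errors="replace")
        time.sleep(min_interval)          # rate limit: at most one read per interval
```

A caller would loop with `for chunk in follow("app.log"): ...` and raise `min_interval` for files that change very often.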

If you want run-dump-exit, the "where'd I toddle off to last" position will have to be stored somewhere. If the file(s) will always be on NTFS partitions, an alternate data stream (ADS) could be a good idea.
Logged

- carpe noctem
strictlyfocused02
Participant
*
Posts: 10

« Reply #21 on: August 09, 2010, 12:31:31 PM »

Quote from: f0dder
Is this supposed to be a general-purpose utility, or "pretty specific"? Do you want the program to output new lines and exit, or would continually running & monitoring work? This all affects strategies :)

If the program can run continuously, it can monitor changes and only read the file on change - and it can rate-limit this to whatever/second in case of often modified files, plus it can cache the "where'd I toddle off to last" in memory.

If you want run-dump-exit, the "where'd I toddle off to last" position will have to be stored somewhere. If the file(s) will always be on NTFS partitions, an alternate data stream (ADS) could be a good idea.

General purpose was what I was thinking. Definitely run and exit, though.

CmputrAce's idea would work well: the utility returns the file size to stdout on the first call, and on the second call, if the file size is supplied, it just returns the new lines. This would be easy enough to handle, because whatever program or script is calling the utility (AHK in my case) can store the size from the first call and then supply it on the second call.
Logged