Topic: My text file manipulation needs (Read 37954 times)

forkinpm

  • Participant
  • Joined in 2009
  • *
  • default avatar
  • Posts: 24
    • View Profile
    • Donate to Member
My text file manipulation needs
« on: October 12, 2009, 06:56 AM »
Hallo!
I am a writer and not a programmer, but am in search of easy to use tools to tackle simple tasks.
They are almost always associated with text files & a tagged CSV English - German dictionary.
For the tasks I have I would like to try and use AutoHotKey. I seem to have no luck with programming languages.
The task I am trying to perform is:
1. Open a simple text file of sentences.
2. Open a CSV dictionary file
3. Read words in turn from the text file, find them each in the dictionary and
4. Copy the Parts-of-Speech for each from dictionary
5. Write the tags in brackets following each word in the text file.
Is it possible to do this using AutoHotKey?
Has anyone tried it or done it?
Can he or she help me?
Or has one of the donation-coders a better way?
I'd be very grateful.
Thanks and regards, forkinpm.

skwire

  • Global Moderator
  • Joined in 2005
  • *****
  • Posts: 5,286
    • View Profile
    • Donate to Member
Re: My text file manipulation needs
« Reply #1 on: October 12, 2009, 07:38 AM »
Can you provide some sample files?

cmpm

  • Charter Member
  • Joined in 2006
  • ***
  • default avatar
  • Posts: 2,026
    • View Profile
    • Donate to Member
Re: My text file manipulation needs
« Reply #2 on: October 12, 2009, 08:04 AM »
http://translateclient.com/

It sits in the tray waiting to be activated for programs.
Not sure if it will work with whatever program you use.

A google product.

forkinpm

Re: My text file manipulation needs
« Reply #3 on: October 12, 2009, 10:47 AM »
Hallo!
Thanks for the trouble you took to reply.
None, or almost none, of the on-line translation tools are adequately tagged, so they are of no use to me as translation tools.
Only someone writing in two languages or doing translations will understand the implications of this.
Most of the products have very little value as translation aids.
For this reason I am developing an approach to creating my own.
Thanks anyway and kind regards, forkinpm.

forkinpm

Re: My text file manipulation needs
« Reply #4 on: October 12, 2009, 10:49 AM »
Hallo again.
The message I just sent was not very well written and had typographical mistakes.
Are you able to live with it, or should I send it again?
Regards, forkinpm.

Paul Keith

  • Member
  • Joined in 2008
  • **
  • Posts: 1,989
    • View Profile
    • Donate to Member
Re: My text file manipulation needs
« Reply #5 on: October 12, 2009, 11:33 AM »
I am not a programmer either and I also don't really understand things like CSVs but from the way you use certain keywords, I am reminded of SRS (Spaced Repetition Software).

With Anki being the best desktop program I know of:

http://ichi2.net/anki/screencast2.html

skwire

Re: My text file manipulation needs
« Reply #6 on: October 12, 2009, 06:32 PM »
Again, please, can you provide your CSV file and a sample sentence file?

forkinpm

Re: My text file manipulation needs
« Reply #7 on: October 13, 2009, 05:15 AM »
Hallo!
Thank you to those who replied.
This afternoon I will post a note with a zip attached (if I can).
The zip will contain the following files:
1. A text file of sentences.
2. Another file listing the words used and their usage frequency.
3. A CSV file with two columns separated by a bar |
Column 1 will list the words used in the text file.
Column 2 will list the parts-of-speech tags in brackets, typically as {N s}.
4. A file listing the parts-of-speech names and their tags.
Thanks and until later this afternoon.
Regards, forkinpm.
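
For readers who want to experiment with the file layout described above (one word and one tag per line, separated by a bar), it could be loaded roughly as follows. This is a minimal Python sketch, not the AutoHotkey script discussed later in the thread. The function name `parse_word_list` and the sample rows are made up for illustration; the tag `{CJ}` in particular is a hypothetical example and does not come from the posted files.

```python
from collections import defaultdict

def parse_word_list(lines):
    """Map each lower-cased word to the list of tags seen for it, in file order."""
    tags_by_word = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or "|" not in line:
            continue  # skip blank lines and lines without a bar separator
        word, tag = (part.strip() for part in line.split("|", 1))
        key = word.lower()
        if tag not in tags_by_word[key]:
            tags_by_word[key].append(tag)  # keep every distinct tag once
    return dict(tags_by_word)

# Hypothetical sample rows in the "word|{tag}" layout (not the real zip contents).
sample = ["man|{N s}", "is|{VB}", "the|{DA}", "for|{PR}", "for|{CJ}"]
lexicon = parse_word_list(sample)
print(lexicon["for"])
```

Keeping a list of tags per word, rather than a single tag, leaves room for words that appear in the list more than once with different tags.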

forkinpm

Re: My text file manipulation needs - the promised zip-file.
« Reply #8 on: October 13, 2009, 09:03 AM »
Hallo!
As promised, attached to this post is a zip of the files for the 'test'.
Good luck.
I hope that it is understandable; if not, drop me a note.
Regards, forkinpm.

skwire

Re: My text file manipulation needs
« Reply #9 on: October 14, 2009, 04:32 AM »
The next question is how do you want this output to look?  Were you expecting something like this?

Sentence: Organized Man is the Motor for Evolution.
Output: Organized {VB} Man {N s} is {VB} the {DA} Motor {N s} for {PR} Evolution {N s}.
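
The sentence/output pair above can be reproduced with a simple lookup. The following is a minimal Python sketch under stated assumptions, not the program skwire later posted: the `LEXICON` dictionary is typed in by hand from the sample output, and the names `WORD` and `tag_sentence` are hypothetical.

```python
import re

# Tags copied from the sample output above; a real run would load them from the word list file.
LEXICON = {
    "organized": "{VB}",
    "man": "{N s}",
    "is": "{VB}",
    "the": "{DA}",
    "motor": "{N s}",
    "for": "{PR}",
    "evolution": "{N s}",
}

WORD = re.compile(r"[A-Za-z]+")  # assumption: plain English letters only

def tag_sentence(sentence, lexicon):
    """Append the tag after every word found in the lexicon; leave other text untouched."""
    def add_tag(match):
        word = match.group(0)
        tag = lexicon.get(word.lower())  # case-insensitive lookup
        return f"{word} {tag}" if tag else word
    return WORD.sub(add_tag, sentence)

print(tag_sentence("Organized Man is the Motor for Evolution.", LEXICON))
```

Because only the letters of each word are matched, the final period stays where it is and does not interfere with the lookup of "Evolution".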

forkinpm

Re: My text file manipulation needs
« Reply #10 on: October 15, 2009, 05:46 AM »
Hallo "skwire"!
Yes, that is more or less what I am looking for.
How can I acquire a copy of the program?
What language is it written in?
and
How can I enter into a dialogue with you on what comes after this first step?
Thanks for the first shot.
Regards, forkinpm.

skwire

Re: My text file manipulation needs
« Reply #11 on: October 15, 2009, 06:20 AM »
I haven't written anything yet but I do have some more thoughts and questions.

1) The CSV file in your zip isn't really a CSV file.  It's a simple text file that wrongly has a CSV extension.
2) I'm envisioning an app with two entry boxes.  The first one would be for the sentence file and the second would be for the word list file (the CSV file in your zip).
3) Output could be a standard save window for the resulting output text file, or, I could display the output in an edit field in the main app.

Is this what you had in mind?

forkinpm

Re: My text file manipulation needs
« Reply #12 on: October 15, 2009, 01:14 PM »
Hallo skwire!
I will read your reply again in the morning and respond to the questions. My motivation is to have as simple a workspace as possible because there are some follow-on steps.
When could you have some code I could use to test and when will I know in which language you will write and what I will be expected to pay??
Regards and thanks again, forkinpm.

tomos

  • Charter Member
  • Joined in 2006
  • ***
  • Posts: 11,959
    • View Profile
    • Donate to Member
Re: My text file manipulation needs
« Reply #13 on: October 15, 2009, 02:59 PM »
[...]  in which language you will write and what I will be expected to pay??
Hi forkinpm :)
I hope Skwire won't mind me commenting:
you could have a look in the 'Post new requests'/'Coding snacks' board to get an idea of how the process might work
e.g. this thread:-
https://www.donation...ex.php?topic=19770.0

Autohotkey (AHK) is most often used to create scripts/programmes. It's portable, you can run a 'script' using AHK; or an .exe file can be made from the script - again that is portable too (the .exe file runs the programme as opposed to installing anything).

This site normally works on the idea of donations - if you use/like a programme or get something custom 'made' the idea is that you donate to the author. This is done by first donating via the last link on the right top of page. You get one credit per dollar donated. These are in your 'account' here. You can then donate 'credits' to the author via the $ icon under his/her avatar on the left. Anyone can cash in their credits.
As to amount - that's more delicate -  :-\ :D - I guess that depends on how much you can afford, how much work is done; how much worth something has... you could also donate to the site if you wanted. There's more info at that donate link.

Hope that helps
Tom

skwire

Re: My text file manipulation needs
« Reply #14 on: October 15, 2009, 04:04 PM »
My motivation is to have as simple a workspace as possible because there are some follow-on steps.
When could you have some code I could use to test and when will I know in which language you will write and what I will be expected to pay??

Give this a try (source/binary included):  Speech Type Tagger download

[Screenshot: 2009-10-15_160855.png]

It's written in AutoHotkey.  Obviously, it can be improved and added on to but I wanted to make sure I was going in the right direction for you.

I hope Skwire won't mind me commenting:

Nope, I don't mind, my friend.  =]
« Last Edit: October 15, 2009, 05:03 PM by skwire »

forkinpm

Re: My text file manipulation needs
« Reply #15 on: October 17, 2009, 03:57 AM »
Hallo skwire, or
can I call you Jody?
I have looked at and can find some affinity to (or is it with) the ahk script.
It provided a usable answer to the first of my topics. It can also be used as an exe.
Related to the script I have the following questions and comments:
1. What does this first script cost?
2. How do I pay for it?
3. Is the payment to you or DC using PayPal?
4. The task is the first of a series of 5 + steps to create a simple translation platform from English to German.
5. How could we deal with the rest?
6. French Italian and Spanish would hopefully follow.
7. Can I have some choice in the tool or language used?
8. I began with the desire to use a language that is close to natural English.
9. The Zeno interpreter followed by C-Sharp would be my choices; before ahk.
10. I could grow to like ahk if I can write scripts using it.
Over this weekend I will write a document which relates to the model and attempts to provide answers to the above.
I would like to have your response first.
Regards and have a nice week-end. forkinpm.

forkinpm

Re: My text file manipulation needs - a new post from me on 17.10.09
« Reply #16 on: October 17, 2009, 06:17 AM »
Hallo again today!
I am just adding comments as my evaluation proceeds.
I am not sure if the reply button is the right one to introduce a new post.
My analysis today:
The initial test, with the same file, as I had provided you with, produced the following results.
* No words that were followed directly by a punctuation mark were tagged.
So I removed all punctuation marks from the text and reran the script, with the following result:
* In most cases initial words and closing words of sentences were still not being tagged.
Can it be that "invisible" end-of-line characters are causing the problem?
Where there are instances of the same word appearing with different tags, how will the script react?
It should have three possibilities:
1. Ignore all because of failing logic
2. Take the first as being the most typical or
3. Take each variant inserting each in turn in separate brackets after the word.
This is the way in which I expect it to be programmed.
My last file test was based on using UTF8 coded text.
I have looked at the script source code in the hope that I would find a clue, but to no avail. I am also left
with the feeling that the code is not simple and might even be described as confusing.
Can you please indicate where the answer might lie?
Again regards and have a nice week-end. forkinpm.
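
The two issues raised in this post (words next to punctuation or line ends, and words with more than one tag) can be illustrated together. This is a minimal Python sketch, not a diagnosis of the AutoHotkey script; the names `tag_text` and `POLICIES` and the two-entry lexicon are hypothetical. It assumes English text made of plain letters and a lexicon that maps each lower-cased word to a list of tags.

```python
import re

WORD = re.compile(r"[A-Za-z]+")  # matches letters only, so ",", "." and "\r\n" never stick to a word
POLICIES = ("skip", "first", "all")  # the three options listed in the post above

def tag_text(text, lexicon, policy="all"):
    """Tag every known word; `policy` decides what happens when a word has several tags."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")

    def add_tags(match):
        word = match.group(0)
        tags = lexicon.get(word.lower(), [])
        if not tags:
            return word                    # unknown word: leave as is
        if len(tags) > 1 and policy == "skip":
            return word                    # option 1: ignore ambiguous words
        if policy == "first":
            tags = tags[:1]                # option 2: take the first tag only
        return word + " " + " ".join(tags) # option 3 ("all"): append every variant

    return WORD.sub(add_tags, text)

# Hypothetical entries; "work" is given two tags to show the three policies.
lexicon = {"the": ["{DA}"], "work": ["{N s}", "{VB}"]}
text = "The work, for now.\r\n"
for policy in POLICIES:
    print(policy, repr(tag_text(text, lexicon, policy)))
```

Matching words with a pattern, instead of splitting the line on spaces, is one way to keep a trailing comma or an invisible carriage return from becoming part of the word that is looked up.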

skwire

Re: My text file manipulation needs
« Reply #17 on: October 17, 2009, 11:35 AM »
can I call you Jody?

Sure, it's my name.  =]

1. What does this first script cost?
2. How do I pay for it?
3. Is the payment to you or DC using PayPal?

No payment is required.  You can click the coin beneath my forum nick if you feel inclined to donate.

4. The task is the first of a series of 5 + steps to create a simple translation platform from English to German.
5. How could we deal with the rest?
6. French Italian and Spanish would hopefully follow.

This sounds like it's quickly becoming more than a Coding Snack.

7. Can I have some choice in the tool or language used?
8. I began with the desire to use a language that is close to natural English.
9. The Zeno interpreter followed by C-Sharp would be my choices; before ahk.

The programmers here at DonationCoder are not hired or retained in any way.  We're just a group that enjoys programming and helping others.  In other words, I'm not sure how to answer this question.  If you want something written in a certain language, feel free to say so.  Maybe somebody else reading this thread will jump in.  I can program in C, Java, AutoHotkey and a little bit of C++, VB and Python.  I've never used Zeno or any of the newer .NET or C# type of languages.

The initial test, with the same file, as I had provided you with, produced the following results.
* No words that were followed directly by a punctuation mark were tagged.
So I removed all punctuation marks from the text and reran the script, with the following result:
* In most cases initial words and closing words of sentences were still not being tagged.
Can it be that "invisible" end-of-line characters are causing the problem?

I'm confused.  Are you saying that the original files in the zip you posted earlier are not working with the script I provided?

Where there are instances of the same word appearing with different tags, how will the script react?
It should have three possibilities:
1. Ignore all because of failing logic
2. Take the first as being the most typical or
3. Take each variant inserting each in turn in separate brackets after the word.
This is the way in which I expect it to be programmed.

I can write code to handle any of the three options you mention.

My last file test was based on using UTF8 coded text.

This could pose a problem since AutoHotkey does not support Unicode/UTF-8 natively.  Can you provide me with the files you're working with?
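
Where the encoding of an input file is uncertain, one common approach is to try UTF-8 first and fall back to a single-byte Windows code page. The following is a minimal Python sketch of that idea, not something the AutoHotkey script does; the function name `decode_text` is hypothetical, and the choice of `cp1252` as the "ANSI" fallback is an assumption about a Western European Windows setup.

```python
def decode_text(data: bytes) -> str:
    """Decode file bytes as UTF-8 (dropping a byte-order mark if present), else as cp1252."""
    try:
        # "utf-8-sig" accepts plain UTF-8 and also strips a leading BOM.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed fallback for "ANSI" files; undefined bytes become U+FFFD.
        return data.decode("cp1252", errors="replace")

# The same German word stored in three different ways.
print(decode_text("für".encode("utf-8")))
print(decode_text(b"\xef\xbb\xbf" + "für".encode("utf-8")))
print(decode_text("für".encode("cp1252")))
```

Pure ASCII files decode identically under both encodings, so the fallback only matters once accented or umlauted characters appear, as they will in a German dictionary.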

forkinpm

Re: My text file manipulation need - post 18.10.2009
« Reply #18 on: October 18, 2009, 06:49 AM »
Hallo Jody!
With this post I am resending the files I originally sent you and from which you developed the ahk script. The files were also used as the material which supported the release of the script.
You will see that it was in fact not a fully tested script. The comments I made about the script not functioning were valid.
I am resending the zipped version of the files as an attachment to this post.
I have also checked that the text file is an ASCII non-formatted text file.
I have tested with ASCII, ANSI and UTF-8 versions of the file, also written with different text editors.
I have prepared a more complete version of what I want and when you are ready I can forward it to you.
Regards, forkinpm.

skwire

Re: My text file manipulation needs
« Reply #19 on: October 18, 2009, 07:18 PM »
Please try this version: Download v1.0.0.5

It should handle first words, last words & multiple speech type matches.

forkinpm

Re: My text file manipulation needs
« Reply #20 on: October 19, 2009, 08:30 AM »
Hallo Jody!
Thank you for the prompt response.
I'll try the new version and get back to you.
Thanks again, forkinpm.

gpetrant

  • Supporting Member
  • Joined in 2008
  • **
  • Posts: 65
    • View Profile
    • Read more about this member.
    • Donate to Member
Re: My text file manipulation needs
« Reply #21 on: October 19, 2009, 09:10 AM »
I don't think your tasks are as simple as they might seem, but there is a commercial product which can easily handle all of them:  Textpipe.  However, it's expensive and has a relatively steep learning curve (read: how familiar are you with regular expressions?).  My suggestion: if your task at hand is a 'one shot deal' (read: only needs to be done once and that's it), then go for the excellent solutions offered here.  If not, consider the one-time purchase of Textpipe an investment which will pay off in spades if you're planning on doing more text manipulation work in the future.  (And, no, I'm not affiliated with them whatsoever; I'm just a customer.)
Shywolf

forkinpm

Re: My text file manipulation needs - the new version
« Reply #22 on: October 20, 2009, 06:20 AM »
Hallo Jody!
Let me begin with a thank you.
Next, there are still a few wrinkles with commas, apostrophe 's and a colon.
Lastly, there are more additional tags than I had expected, but the decision was the right one.
I am reworking the sentence text to remove the wrinkles and additional keywords and building the next steps.
The script will have only minor changes and the next script will have as its first task removing the additional tags.
I would like to do it with user interaction. You will see.
The new script tasks will be:
1. Segmenting the English sentences
2. Resequencing into the German order and
3. Adding the head-word markers for translation.
The result of this step will be the pre-final version of the script.
The final version will be the translation step. That means I must have the dictionary ready, or at least a test version which covers all of the words in the English texts and their German equivalents.
The definition I am putting together now will be ready by mid-day tomorrow, Wednesday 21 October.
I will do my utmost to have the test version of the dictionary, needed to produce a relatively complete translation model script, ready 7 days later.
Regards, forkinpm.
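
One way to smooth out wrinkles with commas, the apostrophe 's and colons is to split them off as tokens of their own before any lookup, so that each can either be tagged separately or passed through. This is a minimal Python sketch of that idea and only a guess at what is wanted here; the names `TOKEN` and `tokenize` and the sample sentence are hypothetical, and it assumes English text made of plain letters.

```python
import re

# Order matters: try the possessive 's first, then a run of letters,
# then any single character that is neither whitespace nor a letter.
TOKEN = re.compile(r"'s\b|[A-Za-z]+|[^\sA-Za-z]")

def tokenize(sentence):
    """Split a sentence into words, the possessive 's, and individual punctuation marks."""
    return TOKEN.findall(sentence)

print(tokenize("Man's cities: big, old."))
```

With the tokens separated like this, entries such as 's or : could be added to the word list alongside ordinary words if they need tags of their own.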

forkinpm

Re: My text file manipulation needs - My deadline of midday today
« Reply #23 on: October 21, 2009, 11:52 AM »
Hallo Jody!
I had promised to let you have the document by midday today, but I am still not ready.
I will work on it until midday tomorrow and then send it, with a statement as to when the full document will be ready.
I do apologize that it has taken longer than promised.
Regards, forkinpm.

forkinpm

Re: My text file manipulation needs
« Reply #24 on: October 22, 2009, 09:31 AM »
Hallo Jody!
I had promised to give you this by midday today, but better late than never.
There is a zip with four files attached.
Though the work is still not complete, you will see from the files that it is getting there.
The 4 files are:
1. NewTestWordsTags.rtf listing the tagged sentences and showing two issues
1.1. Simple changes to reflect taking {'s}, {:} and a missing word {cities} into account
1.2. The additional word tags, which will require a step to remove them.
Perhaps this can be accommodated in the segmentation step.
2. A note on the steps.
3. TestResult.rtf indicating how I am taking the tagged sentences through the next steps. I will add my comments when I send a final update tomorrow.
4. Notation+SentenceMods1.html.
You will see how the rules are being defined on the basis of a logical notation for each segment.
So good luck until tomorrow.
Thanks and regards, forkinpm.