Author Topic: My text file manipulation needs  (Read 13482 times)
forkinpm
Participant
*
Posts: 24

« on: October 12, 2009, 06:56:06 AM »

Hallo!
I am a writer and not a programmer, but am in search of easy to use tools to tackle simple tasks.
They are almost always associated with text files & a tagged CSV English - German dictionary.
For the tasks I have I would like to try and use AutoHotKey. I seem to have no luck with programming languages.
The task I am trying to perform is:
1. Open a simple text file of sentences.
2. Open a CSV dictionary file
3. Read words in turn from the text file, find them each in the dictionary and
4. Copy the Parts-of-Speech for each from dictionary
5. Write the tags in brackets following each word in the text file.
Is it possible to do this using AutoHotKey?
Has anyone tried it or done it?
Can he or she help me?
Or has one of the donation-coders a better way?
I'd be very grateful.
Thanks and regards, forkinpm.
Logged
skwire
Charter Member
***
Posts: 3,911



Another Coding Snack request? Om nom nom...

« Reply #1 on: October 12, 2009, 07:38:55 AM »

Can you provide some sample files?
Logged

cmpm
Charter Member
***
Posts: 2,020

« Reply #2 on: October 12, 2009, 08:04:49 AM »

http://translateclient.com/

It sits in the tray waiting to be activated for programs.
Not sure if it will work with whatever program you use.

A google product.
Logged
forkinpm
Participant
*
Posts: 24

« Reply #3 on: October 12, 2009, 10:47:36 AM »

Hallo!
and
Thanks for the trouble you took to reply.
None, or almost none, of the online translation tools are adequately tagged, so they are of no use to me as translation tools.
Only someone writing in two languages or doing translations will understand the implications of such.
Most of the products have really little value as translation aids.
For this reason I am developing an approach to creating my own.
Thanks anyway and kind regards, forkinpm.
Logged
forkinpm
Participant
*
Posts: 24

« Reply #4 on: October 12, 2009, 10:49:34 AM »

Hallo again.
The message I just sent was not very well written and had typographical mistakes.
Are you able to live with it or should I send it again?
Regards, forkinpm.
Logged
Paul Keith
Member
**
Posts: 1,965


« Reply #5 on: October 12, 2009, 11:33:55 AM »

I am not a programmer either and I also don't really understand things like CSVs, but from the way you use certain keywords, I am reminded of SRS (Spaced Repetition Software).

With Anki being the best desktop program I know of:

http://ichi2.net/anki/screencast2.html
Logged

<reserve space for the day DC can auto-generate your signature from your personal PopUp Wisdom quotes>
skwire
Charter Member
***
Posts: 3,911



« Reply #6 on: October 12, 2009, 06:32:48 PM »

Again, please, can you provide your CSV file and a sample sentence file?
Logged

forkinpm
Participant
*
Posts: 24

« Reply #7 on: October 13, 2009, 05:15:56 AM »

Hallo!
Thank you to those who replied.
This afternoon I will post a note with a zip attached (if I can).
The zip will contain the following files:
1. A text file of sentences.
2. Another file listing the words used and their usage frequency
3. A CSV file with two columns separated by a bar |
Column 1 will list the words used in the text file
Column 2 will list the Parts-of-speech tags in brackets, typically as {N s}
4. A file listing the parts-of-speech names and their tags.
Thanks and until later this afternoon.
Regards, forkinpm.
Logged
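For readers following along: the two-column, bar-separated word list described above could be loaded along these lines. This is only a sketch in Python, not the AutoHotkey script that appears later in the thread; the function name `load_tag_dictionary` and the three sample rows are made up for illustration, and the real file in TextTest.zip may differ in details such as letter case or blank lines.

```python
import io


def load_tag_dictionary(lines):
    """Map each lower-cased word to the list of tags found for it."""
    tags_by_word = {}
    for raw_line in lines:
        line = raw_line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        word, tag = line.split("|", 1)
        # A word may appear on several rows with different tags, so keep a list.
        tags_by_word.setdefault(word.strip().lower(), []).append(tag.strip())
    return tags_by_word


# Illustrative rows in the "word|{tag}" layout; not copied from the real file.
sample = io.StringIO("Man|{N s}\nis|{VB}\nthe|{DA}\n")
dictionary = load_tag_dictionary(sample)
print(dictionary["man"])
```

Keeping a list of tags per word, rather than a single tag, leaves room for words that appear more than once with different tags.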
forkinpm
Participant
*
Posts: 24

« Reply #8 on: October 13, 2009, 09:03:34 AM »

Hallo!
As promised, attached to this email is a zip of the files for the 'test'.
Good luck.
Hope that it is understandable; if not drop me a note,
Regards, forkinpm.

* TextTest.zip (3.38 KB - downloaded 230 times.)
Logged
skwire
Charter Member
***
Posts: 3,911



« Reply #9 on: October 14, 2009, 04:32:31 AM »

The next question is how do you want this output to look?  Were you expecting something like this?

Sentence: Organized Man is the Motor for Evolution.
Output: Organized {VB} Man {N s} is {VB} the {DA} Motor {N s} for {PR} Evolution {N s}.
Logged
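The tagging step that produces output of this shape might look roughly like the sketch below. It is Python for illustration only and is not skwire's code; the small `TAGS` table just repeats the tags visible in the sample output, and the case-insensitive lookup and the letters-only word pattern are assumptions.

```python
import re

# Tags repeated from the sample output in the post above (illustration only).
TAGS = {
    "organized": ["{VB}"],
    "man": ["{N s}"],
    "is": ["{VB}"],
    "the": ["{DA}"],
    "motor": ["{N s}"],
    "for": ["{PR}"],
    "evolution": ["{N s}"],
}

# Assumption: a "word" is a run of ASCII letters; punctuation is left untouched.
WORD = re.compile(r"[A-Za-z]+")


def tag_sentence(sentence, tags):
    """Append the tag(s) for each known word directly after that word."""
    def annotate(match):
        word = match.group(0)
        found = tags.get(word.lower())
        if not found:
            return word  # unknown words are left as they are
        return word + " " + " ".join(found)
    return WORD.sub(annotate, sentence)


result = tag_sentence("Organized Man is the Motor for Evolution.", TAGS)
print(result)
```

Because the pattern matches only the letters, the closing full stop stays where it was and the last word is still looked up.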

forkinpm
Participant
*
Posts: 24

« Reply #10 on: October 15, 2009, 05:46:31 AM »

Hallo "skwire"!
Yes, that is more or less what I am looking for.
How can I acquire a copy of the program?
What language is it written in?
and
How can I enter into a dialogue with you on what comes after this first step?
Thanks for the first shot.
Regards, forkinpm.
Logged
skwire
Charter Member
***
Posts: 3,911



« Reply #11 on: October 15, 2009, 06:20:51 AM »

I haven't written anything yet but I do have some more thoughts and questions.

1) The CSV file in your zip isn't really a CSV file.  It's a simple text file that wrongly has a CSV extension.
2) I'm envisioning an app with two entry boxes.  The first one would be for the sentence file and the second would be for the word list file (the CSV file in your zip).
3) Output could be a standard save window for the resulting output text file, or, I could display the output in an edit field in the main app.

Is this what you had in mind?
Logged

forkinpm
Participant
*
Posts: 24

« Reply #12 on: October 15, 2009, 01:14:57 PM »

Hallo skwire!
I will read your reply again in the morning and respond to the questions. My motivation is to have as simple a workspace as possible because there are some follow-on steps.
When could you have some code I could use to test and when will I know in which language you will write and what I will be expected to pay??
Regards and thanks again, forkinpm.
Logged
tomos
Charter Member
***
Posts: 8,067



« Reply #13 on: October 15, 2009, 02:59:10 PM »

[...]  in which language you will write and what I will be expected to pay??
Hi forkinpm
I hope Skwire won't mind me commenting:
you could have a look in the 'Post new requests'/'Coding snacks' board to get an idea of how the process might work
e.g. this thread:-
http://www.donationcoder....m/index.php?topic=19770.0

AutoHotkey (AHK) is most often used to create scripts/programmes. It's portable: you can run a 'script' using AHK, or an .exe file can be made from the script, which is portable too (the .exe file runs the programme as opposed to installing anything).

This site normally works on the idea of donations - if you use/like a programme or get something custom 'made' the idea is that you donate to the author. This is done by first donating via the last link on the right top of page. You get one credit per dollar donated. These are in your 'account' here. You can then donate 'credits' to the author via the $ icon under his/her avatar on the left. Anyone can cash in their credits.
As to amount - that's more delicate - I guess that depends on how much you can afford, how much work is done, how much worth something has... you could also donate to the site if you wanted. There's more info at that donate link.

Hope that helps
Logged

Tom
skwire
Charter Member
***
Posts: 3,911



« Reply #14 on: October 15, 2009, 04:04:30 PM »

My motivation is to have as simple a workspace as possible because there are some follow-on steps.
When could you have some code I could use to test and when will I know in which language you will write and what I will be expected to pay??

Give this a try (source/binary included):  Speech Type Tagger download



It's written in AutoHotkey.  Obviously, it can be improved and added on to but I wanted to make sure I was going in the right direction for you.

I hope Skwire won't mind me commenting:

Nope, I don't mind, my friend.  =]
« Last Edit: October 15, 2009, 05:03:54 PM by skwire » Logged

forkinpm
Participant
*
Posts: 24

« Reply #15 on: October 17, 2009, 03:57:01 AM »

Hallo skwire, or
can I call you Jody?
I have looked at and can find some affinity to (or is it with) the ahk script.
It provided a usable answer to the first of my topics. It can also be used as an exe.
Related to the script I have the following questions and comments:
1. What does this first script cost?
2. How do I pay for it?
3. Is the payment to you or DC using PayPal?
4. The task is the first of a series of 5 + steps to create a simple translation platform from English to German.
5. How could we deal with the rest?
6. French Italian and Spanish would hopefully follow.
7. Can I have some choice in the tool or language used?
8. I began with the desire to use a language that is close to natural English.
9. The Zeno interpreter followed by C-Sharp would be my choices; before ahk.
10. I could grow to like ahk if I can write scripts using it.
Over this weekend I will write a document which relates to the model and attempts to provide answers to the above.
I would like to have your response first.
Regards and have a nice week-end. forkinpm.
Logged
forkinpm
Participant
*
Posts: 24

« Reply #16 on: October 17, 2009, 06:17:57 AM »

Hallo again today!
I am just adding comments as my evaluation proceeds.
I am not sure if the reply button is the right one to introduce a new post.
My analysis today:
The initial test, with the same file that I had provided you with, produced the following results.
* No words that were followed directly by a punctuation mark were tagged.
So I removed all punctuation marks from the text and reran the script, with the following result:
* In most cases initial words and closing words of sentences were still not being tagged.
Can it be that "invisible" end-of-line characters are causing the problem?
Where there are instances of the same word appearing with different tags, how will the script react?
It should have three possibilities:
1. Ignore all because of failing logic
2. Take the first as being the most typical or
3. Take each variant inserting each in turn in separate brackets after the word.
This is the way in which I expect it to be programmed.
My last file test was based on using UTF8 coded text.
I have looked at the script source code in the hope that I would find a clue, but to no avail. I am also left with the feeling that the code is not simple and might even be described as confusing.
Can you please indicate where the answer might lie?
Again regards and have a nice week-end. forkinpm.
Logged
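One way to picture the symptoms reported above: if a script splits the text on spaces only and looks each piece up verbatim, a word glued to a full stop or to a line ending never matches the word list. The Python sketch below illustrates that guess; it is not a diagnosis of the actual AutoHotkey script, and the tiny `TAGS` table and the sample text are invented for the demonstration.

```python
import re

# Invented word list and text, only to demonstrate the effect.
TAGS = {"organized": "{VB}", "man": "{N s}", "evolution": "{N s}"}
text = "Organized Man is the Motor for Evolution.\r\nMan is organized.\r\n"

# Naive approach: split on single spaces, then look each piece up as-is.
# "Evolution.\r\nMan" and "organized.\r\n" stay glued together and are missed.
naive_tokens = text.split(" ")
naive_hits = [token for token in naive_tokens if token.lower() in TAGS]

# More forgiving approach: pull out runs of letters, which ignores
# punctuation and the invisible carriage-return/line-feed characters.
word_tokens = re.findall(r"[A-Za-z]+", text)
word_hits = [token for token in word_tokens if token.lower() in TAGS]

print(naive_hits)
print(word_hits)
```

In this toy case the naive split finds only two of the five taggable words; the missed ones are exactly the last word of a sentence and the first word of the next line.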
skwire
Charter Member
***
Posts: 3,911



« Reply #17 on: October 17, 2009, 11:35:04 AM »

can I call you Jody?

Sure, it's my name.  =]

1. What does this first script cost?
2. How do I pay for it?
3. Is the payment to you or DC using PayPal?

No payment is required.  You can click the coin beneath my forum nick if you feel inclined to donate.

4. The task is the first of a series of 5 + steps to create a simple translation platform from English to German.
5. How could we deal with the rest?
6. French Italian and Spanish would hopefully follow.

This sounds like it's quickly becoming more than a Coding Snack.

7. Can I have some choice in the tool or language used?
8. I began with the desire to use a language that is close to natural English.
9. The Zeno interpreter followed by C-Sharp would be my choices; before ahk.

The programmers here at DonationCoder are not hired or retained in any way.  We're just a group that enjoy programming and helping others.   In other words, I'm not sure how to answer this question.  If you want something written in a certain language, feel free to say so.  Maybe somebody else reading this thread will jump in.  I can program in C, Java, AutoHotkey and a little bit of C++, VB and Python.  I've never used Zeno or any of the newer .NET or C# type of languages. 

The initial test, with the same file that I had provided you with, produced the following results.
* No words that were followed directly by a punctuation mark were tagged.
So I removed all punctuation marks from the text and reran the script, with the following result:
* In most cases initial words and closing words of sentences were still not being tagged.
Can it be that "invisible" end-of-line characters are causing the problem?

I'm confused.  Are you saying that the original files in the zip you posted earlier are not working with the script I provided?

Where there are instances of the same word appearing with different tags, how will the script react?
It should have three possibilities:
1. Ignore all because of failing logic
2. Take the first as being the most typical or
3. Take each variant inserting each in turn in separate brackets after the word.
This is the way in which I expect it to be programmed.

I can write code to handle any of the three options you mention.

My last file test was based on using UTF8 coded text.

This could pose a problem since AutoHotkey does not support Unicode/UTF-8 natively.  Can you provide me with the files you're working with?
Logged
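The three possibilities forkinpm lists for a word with several tags can be written as a small policy switch. A hedged Python sketch follows; the function name `choose_tags`, the policy names and the `{AJ}` tag are hypothetical and do not come from the thread's files.

```python
def choose_tags(found, policy):
    """Return the tags to write after a word under the given policy."""
    unique = list(dict.fromkeys(found))  # drop repeats, keep original order
    if policy == "ignore":
        # Option 1: write nothing when the word's tag is ambiguous.
        return unique if len(unique) == 1 else []
    if policy == "first":
        # Option 2: treat the first tag listed as the most typical one.
        return unique[:1]
    if policy == "all":
        # Option 3: write every variant, each in its own brackets.
        return unique
    raise ValueError("unknown policy: " + policy)


found = ["{VB}", "{AJ}"]  # hypothetical: one word listed with two tags
for policy in ("ignore", "first", "all"):
    print(policy, choose_tags(found, policy))
```

Option 3 is the one forkinpm says he expects, so `"all"` would be the natural default in a real tool.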

forkinpm
Participant
*
Posts: 24

« Reply #18 on: October 18, 2009, 06:49:46 AM »

Hallo Jody!
With this post I am resending the files I originally sent you and from which you developed the ahk script. The files were also used as the material which supported the release of the script.
You will see that it was in fact a not fully tested script. The comments I made about the script not functioning were valid.
I am resending the zipped version of the files as an attachment to this post.
I have also checked that the text file is an ASCII, non-formatted text file.
I have tested with ASCII, ANSI and UTF-8 file versions, also written with different text editors.
I have prepared a more complete version of what I want and when you are ready I can forward it to you.
Regards, forkinpm.

* TextTest.zip (3.38 KB - downloaded 187 times.)
Logged
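A UTF-8 file saved with a byte order mark is one possible reason, not confirmed anywhere in this thread, for a first word failing to match: the invisible mark sticks to the first word. The Python sketch below shows the effect and one way around it using the standard `utf-8-sig` codec; the file name `sentences.txt` is made up.

```python
import codecs
import os
import tempfile

# Write a small UTF-8 file that starts with a byte order mark (BOM),
# as some text editors do when saving as "UTF-8".
path = os.path.join(tempfile.mkdtemp(), "sentences.txt")
with open(path, "wb") as handle:
    handle.write(codecs.BOM_UTF8 + "Organized Man".encode("utf-8"))

# Read as plain UTF-8: the BOM survives as an invisible character
# attached to the first word, so a dictionary lookup would miss it.
with open(path, encoding="utf-8") as handle:
    first_plain = handle.read().split()[0]

# Read with "utf-8-sig": the codec strips the BOM if one is present.
with open(path, encoding="utf-8-sig") as handle:
    first_clean = handle.read().split()[0]

print(repr(first_plain))
print(repr(first_clean))
```

The `utf-8-sig` codec also reads files without a mark correctly, so it is a safe choice when the editor used to save the file is unknown.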
skwire
Charter Member
***
Posts: 3,911



« Reply #19 on: October 18, 2009, 07:18:46 PM »

Please try this version: Download v1.0.0.5

It should handle first words, last words & multiple speech type matches.
Logged

forkinpm
Participant
*
Posts: 24

« Reply #20 on: October 19, 2009, 08:30:47 AM »

Hallo Jody!
Thank you for the prompt response.
I'll try the new version and get back to you.
Thanks again, forkinpm.
Logged
gpetrant
Supporting Member
**
Posts: 65


« Reply #21 on: October 19, 2009, 09:10:33 AM »

I don't think your tasks are as simple as they might seem, but there is a commercial product which can easily handle all of them:  Textpipe  However, it's expensive and has a relatively steep learning curve (read: how familiar are you with regular expressions?).  My suggestion: if your task at hand is a 'one shot deal' (read: only needs to be done once and that's it), then go for the excellent solutions offered here.  If not, consider the onetime purchase of Textpipe an investment which will pay off in spades if you're planning on doing more text manipulation work in the future.  (And, no, I'm not affiliated with them whatsoever; I'm just a customer.)   
Logged

Shywolf
forkinpm
Participant
*
Posts: 24

« Reply #22 on: October 20, 2009, 06:20:59 AM »

Hallo Jody!
Let me begin with a thank you.
Next, there are still a few wrinkles with commas, apostrophe 's and a colon.
Lastly, there are more additional tags than I had expected, but the decision was the right one.
I am reworking the sentence text to remove the wrinkles and additional keywords and building the next steps.
The script will have only minor changes and the next script will have as its first task removing the additional tags.
I would like to do it with user interaction. You will see.
The new script tasks will be:
1. Segmenting the English sentences
2. Resequencing into the German order and
3. Adding the head-word markers for translation.
The result of this step will be the pre-final version of the script.
The final version will be the translation step. That means I must have the dictionary ready, or at least a test version which covers all of the words in the English texts and their German equivalents.
The definition I am putting together now will be ready by mid-day tomorrow, Wednesday 21 October.
I will do my utmost to have the test version of the dictionary, which is needed to produce a relatively complete translation model script, ready 7 days later.
Regards, forkinpm.
Logged
forkinpm
Participant
*
Posts: 24

« Reply #23 on: October 21, 2009, 11:52:26 AM »

Hallo Jody!
I had promised to let you have the document by midday today, but I am still not ready.
I will work on it until midday tomorrow and then send it, with a statement as to when the full document will be ready.
I do apologize that it has taken longer than promised.
Regards, forkinpm.
Logged
forkinpm
Participant
*
Posts: 24

« Reply #24 on: October 22, 2009, 09:31:59 AM »

Hallo Jody!
I had promised to give you this by midday today, but better late than never.
There is a zip with four files attached.
Though the work is still not complete, you will see from the files that it is getting there.
The 4 files are:
1. NewTestWordsTags.rtf listing the tagged sentences and showing two issues
1.1. Simple changes to reflect taking {'s}, {:} and a missing word {cities} into account
1.2. The additional word tags, which will require a step to remove them.
Perhaps this can be accommodated in the segmentation step.
2. A note on the steps.
3. TestResult.rtf indicating how I am taking the tagged sentences through the next steps. I will add my comments when I send a final update tomorrow.
4. Notation+SentenceMods1.html.
You will see how the rules are being defined on the basis of a logical notation for each segment.
So good luck until tomorrow.
Thanks and regards, forkinpm.
Logged