DonationCoder.com Software > Post New Requests Here

[Request] Tell me who said what first!

vevola:
After a great experience with DonationCoder, I'm posting another request.

I have a series of transcribed conversations. Each text file consists of lines that begin with an initial and a colon, indicating who says what. I would like to see which words are used by one speaker before the other speaker uses them, as well as other things like frequency and collocation.

So here's an example:

A: So, I really like all those dresses, especially this red and that green thing there.
B: Yeah, the red one is nice.
A: Which one are you gonna buy?
B: I'll get the red one.

Here's what I want to be able to get.

For A:
- [What words said first by A:]
   "red" was said first by A:
- [Collocation first occurrence]
   the first time A: said "red" was in line 1
- [Frequency for A:]
   A: said "red" a total of 1 times
- [Collocation for A:]
   A: said "red" in line 1
- [Frequency for B:]
   B: repeated "red" a total of 2 times
- [Collocation for B:]
   B: said "red" in lines 2, 4

For B:
- [What words said first by B:]
   "one" was said first by B:
- [Collocation first occurrence]
   the first time B: said "one" was in line 2
- [Frequency for B:]
   B: said "one" a total of 2 times
- [Collocation for B:]
   B: said "one" in lines 2, 4
- [Frequency for A:]
   A: repeated "one" a total of 1 times
- [Collocation for A:]
   A: said "one" in line 3

My conversations have 3 speakers though, which might make it trickier.

How I see this happening: if it's possible to isolate all lines that begin with A: or B:, I imagine it's relatively easy to make a word list for each speaker that includes word frequency and collocation (the line numbers where each word occurs). Then you'd compare these lists in pairs (A+B, B+C, A+C), checking the line number of each word's first occurrence for each speaker to see which is smaller (e.g. first occurrence of "red": A: line 1; B: line 2 --> 1 is less than 2, hence A: said "red" before B:).
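
The comparison described above could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not a finished tool: the names `build_index` and `first_speaker` are hypothetical, a speaker label is assumed to be a single letter followed by a colon, and a "word" is assumed to be a lowercased run of letters and apostrophes. Building a single index for all speakers means a third speaker needs no extra pairwise comparisons.

```python
import re
from collections import defaultdict

# Assumption: a speaker label is one letter followed by a colon, e.g. "A:".
LINE_RE = re.compile(r"^\s*([A-Za-z]):\s*(.*)$")
# Assumption: a word is a run of letters and apostrophes, compared in lowercase.
WORD_RE = re.compile(r"[a-z']+")


def build_index(lines):
    """Map word -> speaker -> list of 1-based line numbers (one entry per occurrence)."""
    index = defaultdict(lambda: defaultdict(list))
    for lineno, line in enumerate(lines, start=1):
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that carry no speaker label
        speaker, text = match.groups()
        for word in WORD_RE.findall(text.lower()):
            index[word][speaker].append(lineno)
    return index


def first_speaker(index, word):
    """Return (speaker, line) for the earliest occurrence of word, or None."""
    speakers = index.get(word)
    if not speakers:
        return None
    # Each speaker's list is in ascending order, so its first entry is
    # that speaker's first use; the smallest of those wins.
    return min(((s, nums[0]) for s, nums in speakers.items()), key=lambda p: p[1])


sample = [
    "A: So, I really like all those dresses, especially this red and that green thing there.",
    "B: Yeah, the red one is nice.",
    "A: Which one are you gonna buy?",
    "B: I'll get the red one.",
]
index = build_index(sample)
print(first_speaker(index, "red"))   # ('A', 1)
print(first_speaker(index, "one"))   # ('B', 2)
print(len(index["red"]["B"]))        # frequency of "red" for B: 2
print(index["red"]["B"])             # lines where B said "red": [2, 4]
```

For a real transcript, `lines` could come from reading the text file line by line, so the reported numbers match the file's own line numbers.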

Any suggestions? Volunteers? :)

skwire:
Are the match words ("red" and "one" in your examples) provided by the user? 

vevola:
Are the match words ("red" and "one" in your examples) provided by the user? 
-skwire (July 26, 2011, 08:03 AM)
--- End quote ---

No, that was just an example! :)

The text files are a lot longer (about 2000 lines).

skwire:
So you want a report detailing EVERY word in your conversation file?   :huh:

vevola:
Every word would be ok too. I'm not sure which words to exclude as of yet, so all words might be easier.
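
Since the words to exclude are not decided yet, one option is to count every word and leave the exclusion list as an optional parameter that starts out empty. A minimal Python sketch of that idea follows; the function name `word_counts` and its `exclude` parameter are hypothetical, and it assumes the same "initial plus colon" line format as the example above.

```python
import re
from collections import Counter


def word_counts(lines, exclude=frozenset()):
    """Count words per speaker, skipping any word listed in `exclude`."""
    counts = {}
    for line in lines:
        # Assumption: each spoken line starts with one letter and a colon.
        match = re.match(r"^\s*([A-Za-z]):\s*(.*)$", line)
        if not match:
            continue  # ignore lines without a speaker label
        speaker, text = match.groups()
        words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in exclude]
        counts.setdefault(speaker, Counter()).update(words)
    return counts


sample = [
    "A: So, I really like all those dresses, especially this red and that green thing there.",
    "B: Yeah, the red one is nice.",
    "A: Which one are you gonna buy?",
    "B: I'll get the red one.",
]
everything = word_counts(sample)
filtered = word_counts(sample, exclude={"the", "is"})
print(everything["B"]["the"])   # 2
print(filtered["B"]["the"])     # 0, because "the" was excluded
print(filtered["B"]["red"])     # 2
```

Starting with an empty `exclude` set reports every word; words can be added to it later once it is clear which ones are noise.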
