DonationCoder.com Software > Post New Requests Here

Separate Out STOCK Symbols From Large Text File

skwire:
--- Quote from 4wd (August 28, 2017, 12:19 AM) ---
@skwire: What regex did you use, (if you did)?
--- End quote ---

What ended up working the best was:

1. Loop through each line.
2. Crack each line on its spaces.
3. Evaluate each part with: [A-Z]+\b

It's not quite perfect, and I'm not sure it ever could be, but it seems to be close enough for the OP's work.
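
For anyone following along, the three steps above might be sketched in PowerShell roughly as follows. This is only an illustration, not skwire's actual code: the function name Get-UppercaseTokens is made up, and using a case-sensitive -cmatch to "evaluate each part" is an assumption about how the regex was applied.

--- Code: PowerShell ---
```powershell
# Hypothetical sketch of the loop / split / match steps; not the original tool's code.
function Get-UppercaseTokens {
    param([string[]]$Lines)
    foreach ($line in $Lines) {                    # 1. loop through each line
        foreach ($part in ($line -split ' ')) {    # 2. crack the line on its spaces
            # 3. case-sensitive test; $Matches[0] holds the run of capitals that matched
            if ($part -cmatch '[A-Z]+\b') { $Matches[0] }
        }
    }
}

# Sample sentence invented for illustration
Get-UppercaseTokens -Lines 'Apple (AAPL) rose, while Facebook (FB) fell.'
```

With the sample sentence this emits AAPL and FB. Capitalised words such as "Apple" are skipped because the \b cannot match between "A" and "p". A standalone capital like "I" or "A" would still slip through, which fits the "not quite perfect" caveat. To run it against a file, something like Get-UppercaseTokens -Lines (Get-Content -Path .\test.txt) should work.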

4wd:
I was trying to end up with something along the lines of:


--- Code: PowerShell ---
(Get-Content -Path .\test.txt) -creplace "\b[^A-Z]+\b" " " | Set-Content -Path out.txt
That would have left spaces between the remaining terms to make output formatting easier ... but I wasn't getting the RegEx right.

That was the theory anyway  ;D

Now that I think about it, I could have split the input easily:

--- Code: PowerShell ---
[regex]::Split((Get-Content -Path .\test.txt),'[\s,\.\(\)]') | %{ if($_ -cmatch('^[A-Z]+$')){Write-Host $_}}
The output is OK, except that it includes both SBA and SBAC as well as duplicated entries.

@skwire: How did you stop SBA from being passed through, since SBAC is the stock code (considering that codes occur both inside and outside of surrounding brackets)?

You must be checking for duplicated entries also since there are multiple occurrences of JD and QQQ.

Was something to blow a few cobwebs out of my head anyway  :P

Addendum:
Duplicates removed courtesy of http://www.secretgeek.net/ps_duplicates

--- Code: PowerShell ---
$hash = @{};[regex]::Split((Get-Content -Path .\test.txt),'[\s,\.\(\)]') | %{if($_ -cmatch('^[A-Z]+$')){if($hash.$_ -eq $null) { $_ }; $hash.$_ = 1}} > .\output.txt

Output:
FWONA
FB
OLED
AAPL
SBA
SBAC
ABMD
BAC
DNB
CHK
HAL
XOM
RIG
DAL
AAL
GM
F
QQQ
TVIX
VIX
ETN
CSCO
JD
MOMO
BRCD

skwire:
--- Quote from 4wd (August 28, 2017, 08:23 PM) ---
You must be checking for duplicated entries also since there are multiple occurrences of JD and QQQ.
--- End quote ---

Yes, at the end of the list processing, I do an alpha sort and filter out duplicates.
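
In PowerShell terms, a sort-then-dedupe pass like that could look something like the sketch below. It is only an illustration of the idea, not the code from the actual tool. The symbols are borrowed from the output posted earlier in the thread, with repeats added by hand.

--- Code: PowerShell ---
```powershell
# Sample input: symbols from the thread's output, with deliberate duplicates
$symbols = 'QQQ', 'JD', 'FB', 'QQQ', 'AAPL', 'JD'

# Sort alphabetically and drop repeated entries in a single pass
$unique = $symbols | Sort-Object -Unique

$unique
```

Sort-Object compares case-insensitively unless -CaseSensitive is given, which should not matter here since every symbol is already upper case.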
