  • October 22, 2019, 06:10 AM

Messages - kalos

Why is it useless? It's an exact representation, apart from the fact that there is more irrelevant text around it.

The format of the data is like that (the only difference is that the data is multiline rather than single line as in this example):


So I want the output to be a csv like:
prod1; a; b
prod2; c; d

So I was thinking of first using a regex to highlight/save in a variable the first area of the text that belongs to a prod, which is the first six lines (I cannot use the number of lines to distinguish them as they vary).
Then it would extract a and b from that variable by matching the specs and price regex 'within' prod1 variable, so that I can distinguish them from prod2.
And then loop to complete the conversion.

Hope this helps?
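
The two-step idea above (grab one product's block, then search inside it) can be sketched roughly as follows. This is only a sketch in Python, chosen because the thread leaves the tool open. The sample data and the prodID="...", specs="..." and price="..." layout are assumptions, since the real data was not shown.

```python
import re

# Hypothetical sample; the real layout was not posted in the thread.
SAMPLE = """\
<item prodID="prod1">
  irrelevant text
  specs="a"
  price="b"
</item>
<item prodID="prod2">
  specs="c"
  other irrelevant text
  price="d"
</item>
"""

# Step 1: one block runs from a prodID up to the next prodID (or end of text).
BLOCK = re.compile(r'prodID="([^"]*)"(.*?)(?=prodID="|\Z)', re.DOTALL)
# Step 2: patterns searched only inside the block found in step 1.
SPECS = re.compile(r'specs="([^"]*)"')
PRICE = re.compile(r'price="([^"]*)"')


def to_csv_rows(text):
    """Return one 'prod; specs; price' row per product block."""
    rows = []
    for block in BLOCK.finditer(text):
        prod_id, body = block.group(1), block.group(2)
        specs = SPECS.search(body)
        price = PRICE.search(body)
        rows.append("; ".join([
            prod_id,
            specs.group(1) if specs else "",  # empty field if not found
            price.group(1) if price else "",
        ]))
    return rows


if __name__ == "__main__":
    print("\n".join(to_csv_rows(SAMPLE)))
```

Because the block boundary is "up to the next prodID", the number of lines per product does not matter.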

So my understanding is that I cannot search for a regex that will match "specs=.+?" or something because I won't be able to distinguish this for prod1, prod2, etc.
At the same time, I cannot match the regex "prod1.+specs=.+?" because I don't know the exact text for prod1 (it's an xml attribute that is called prodID, but the value can be anything).

Do you have any idea on how to process this?

If you are searching for a regex within a regex, 'You Are Doing It Wrong' (T).

Your initial requirement was to find and extract content using a regex, but now you need parts of that match split out? That can be done with a single regex, by grouping the parts you need to split out.
And for this whole exercise to make any sense, where is the variable part of the data to find? When searching for explicit text(s), a count would suffice...
Please provide a complete example with actual data (not an entire file!), clearly marking the stuff you need to extract, showing what you want to achieve, not how you think it could/should be solved.
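
A rough illustration of the "single regex with groups" suggestion, again in Python. The attribute names (prodID, specs, price) and the assumption that specs always comes before price within a product are hypothetical, since no real data was posted.

```python
import re

# One pattern, three named groups. The (?:(?!prodID=).)*? parts skip over
# irrelevant text but refuse to cross into the next product.
PRODUCT = re.compile(
    r'prodID="(?P<prod>[^"]*)"'
    r'(?:(?!prodID=).)*?specs="(?P<specs>[^"]*)"'
    r'(?:(?!prodID=).)*?price="(?P<price>[^"]*)"',
    re.DOTALL,  # let "." span line breaks, as the data is multiline
)


def extract(text):
    """Return a (prod, specs, price) tuple for every complete product."""
    return [(m["prod"], m["specs"], m["price"]) for m in PRODUCT.finditer(text)]
```

A product that lacks one of the fields is skipped entirely instead of borrowing the field from the following product.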

Indeed, I now realised it!
I will try to provide an example in a bit.

$items - an arbitrarily named variable
=        - assignment operator, stores the result on the right in the variable on the left

Thus $items now equals an array of files in the current folder that match *.txt
$items[0] = firstfile.txt
$items[1] = secondfile.txt

$items.Count  - total number of matching files found

for(){}   - a for loop, $i is a variable that gets incremented by 1 every loop until the total number of matching files is reached

Thus loop through all the files in the array performing the following on every file:

Select-String -Path $items[$i] -Pattern $regex -AllMatches

Search each file for matching RegEx pattern, get all matches.

| % { $_.Matches } | % { $_.Value } >> $outfile

RegEx matches are piped into a ForEach loop (shorthand notation). For each regex match, pipe its value to the output file in append mode.
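
For readers who find the PowerShell hard to follow, roughly the same loop can be sketched in Python. This is not the script from the thread, just an equivalent under assumptions: the function name append_matches and its parameters are made up for illustration.

```python
import re
from pathlib import Path


def append_matches(folder, pattern, outfile):
    """Append every regex match found in folder/*.txt to outfile.

    Returns the number of matches written. Keep outfile outside the
    *.txt set, otherwise it would be scanned as input on a later run.
    """
    regex = re.compile(pattern)
    count = 0
    with open(outfile, "a", encoding="utf-8") as out:  # "a" = append mode
        for path in sorted(Path(folder).glob("*.txt")):  # all matching files
            with open(path, encoding="utf-8", errors="replace") as src:
                for line in src:  # one line at a time
                    for match in regex.finditer(line):  # all matches on the line
                        out.write(match.group(0) + "\n")
                        count += 1
    return count
```

Usage would be something like append_matches(".", r"id=\d+", "out.log").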

Don't actually need to escape the " in the RegEx either:
Code: PowerShell
$regex = '<dsf:tsdfgd trsdfge="urn:x-ssdfgs-dfg-com:isdfgc/tg4r3e-i4d" id="OsdfgsdfD">'
Will also work.

Same as the 6 lines above without assigned variables or a for loop:
Code: PowerShell
gci *.txt | % { sls $_.Name -Pattern '<dsf:tsdfgd trsdfge="urn:x-ssdfgs-dfg-com:isdfgc/tg4r3e-i4d" id="OsdfgsdfD">' -a | % { $_.Matches } | % { $_.Value } >> K:\out.txt }

That is very helpful thanks!

From what I have understood, the script will first scan its own folder where it exists, for all the txt files present and process them one by one in an array. Actually I think I can skip that bit if it can process the whole 25GB txt file at once.

As for the actual regex matches, what I would actually like it to do is to:
- scan the source file for a regex(A)
- finding the first instance of regex(A), it would store it in a variable and search another regex(B) inside that variable.
- then I have a couple more regex matches that I need it to search for in that variable, and output specific things from these regex matches inside the initial regex(A). By output I mean write sequentially, line by line, to an output file.
- then the loop will continue with the next regex(A) match inside the source file, and store it in a variable, and search for the same regex(B) etc matches inside that variable and output parts of those regex matches in the output file.

Sounds very basic and simple. Can you tell me what commands I need to write something like that please?

Thanks, but I struggle to follow. I find AHK much more straightforward. But how can I make it work with a 25GB file?

Can you explain please word by word this bit:

$items = Get-ChildItem -Path *.txt       # *.txt , *.foo , *.whatever
for ($i = 0; $i -lt $items.Count; $i++) {
  Select-String -Path $items[$i] -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } >> $outfile
}

Also, I need to append to the output file several regex matches/returns, how do I do that?
Also, if I specify a regex match, how do I specify what I want to be returned from this match?

Very interesting!

Do you know a good site that explains the structure of the script you posted and the definition/usage of the commands along with examples?

Also, does this script load the whole text of the file in memory to perform its operations? This will be a problem for a 25GB file
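
One possible strategy for a file that size, sketched in Python: read it line by line and keep only the current block in memory. The sketch assumes every regex(A) match starts on its own line. The names iter_blocks and extract_from_file are made up for illustration.

```python
import re


def iter_blocks(lines, start_pattern):
    """Yield one block of text at a time.

    A block starts at a line matching start_pattern and runs until the
    next such line. Only the current block is held in memory.
    """
    start = re.compile(start_pattern)
    block = []
    for line in lines:
        if start.search(line):
            if block:
                yield "".join(block)  # previous block is complete
            block = [line]
        elif block:  # lines before the first start line are ignored
            block.append(line)
    if block:
        yield "".join(block)  # last block in the file


def extract_from_file(path, start_pattern, inner_pattern, outfile):
    """Write every inner_pattern match found inside each block to outfile."""
    inner = re.compile(inner_pattern)
    with open(path, encoding="utf-8", errors="replace") as src, \
            open(outfile, "w", encoding="utf-8") as out:
        for block in iter_blocks(src, start_pattern):  # file is read lazily
            for match in inner.finditer(block):
                out.write(match.group(0) + "\n")
```

Iterating over an open file reads it one line at a time, so memory use depends on the size of the largest block, not on the size of the file.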

Could you tell me please in AHK? I am not familiar with that language, unless you can point me to the explanations of these commands?

Also, the file I want to manipulate is 25GB! Is there a strategy to handle this?

Thanks but I am not familiar with this language.

I think Autohotkey would be more appropriate for me as it has simple structure (what are the $i = 0; $i -lt $items.Count; $i++ they look like Aramaic to me :P)

Can you tell me the commands/structure in AHK to do this:

search for a regex1, extract regex2 from regex1 (ie append to a new file), and continuously loop until there is no other regex1 found

Also, how do I specify an exact string in regex? I want to specify the string <dsf:tsdfgd trsdfge="urn:x-ssdfgs-dfg-com:isdfgc/tg4r3e-i4d" id="OsdfgsdfD">
and I don't want to escape every single symbol etc.
Is there a way to search for an exact string literally?
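
In Python, for instance, re.escape turns an exact string into a pattern that matches it literally, so nothing has to be escaped by hand. A minimal sketch (the helper name literal_regex is made up):

```python
import re

LITERAL = '<dsf:tsdfgd trsdfge="urn:x-ssdfgs-dfg-com:isdfgc/tg4r3e-i4d" id="OsdfgsdfD">'


def literal_regex(text):
    """Compile a pattern that matches text exactly, character for character."""
    return re.compile(re.escape(text))


if __name__ == "__main__":
    sample = "before " + LITERAL + " after"
    print(literal_regex(LITERAL).search(sample) is not None)
```

If the whole search is literal and no regex features are needed, a plain substring test such as `LITERAL in sample` does the same job.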


Which tool/scripting language can I use to match REGEX strings and extract them to a new file or to just delete the non matching strings from multiple text files?

It would help to be easy to write as I cannot learn complicated syntax!

Also, ideally it should work via command line as I am talking about many many files which can be huge.


Thanks, as for now, is there an XML reader that can extract the hierarchy tree of a specific line?


You know that xml files are hierarchical, ie there is a top level attribute, then sub-attributes are indented and so on.

How can I copy the hierarchy, ie the upper level attributes of a selected attribute/line?

For example:


I want to select x2 and copy the hierarchy of it, ie the upper level attributes that x2 belongs to:

Any idea?
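
One way to get the chain of upper levels, sketched in Python with the standard xml.etree.ElementTree module. The sample XML and the name attribute are made up, since the original example was not preserved.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SAMPLE = """\
<root>
  <a name="a1">
    <b name="b1">
      <x name="x1"/>
      <x name="x2"/>
    </b>
  </a>
</root>
"""


def ancestor_path(root, target):
    """Return the tag names from the top level down to target."""
    # ElementTree keeps no parent links, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    path = [target]
    while path[-1] in parent_of:  # climb until the root is reached
        path.append(parent_of[path[-1]])
    return [element.tag for element in reversed(path)]


root = ET.fromstring(SAMPLE)
target = root.find(".//x[@name='x2']")  # the selected element
print(" > ".join(ancestor_path(root, target)))
```

For the sample above this prints the path root > a > b > x.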



I have two lists with numbers. How can I find how many pairs there are between those two lists (ie similarity), where each pair is the number from second list that matches +/-5% to a number in the first list?
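
A possible reading of this, sketched in Python: count the numbers in the second list that fall within +/-5% of at least one number in the first list. That interpretation and the function name count_matches are assumptions.

```python
def count_matches(first, second, tolerance=0.05):
    """Count numbers in second that lie within +/-tolerance of some number in first."""
    return sum(
        1
        for b in second
        if any(abs(b - a) <= tolerance * abs(a) for a in first)
    )


if __name__ == "__main__":
    print(count_matches([100, 200], [96, 104, 106, 195, 300]))
```

Here 96 and 104 are within 5% of 100 and 195 is within 5% of 200, while 106 and 300 match nothing. This compares every number with every other, which is fine for short lists. For very long lists, sorting the first list and using a binary search would be faster.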


Living Room / Current technological and social challenges
« on: July 01, 2018, 11:39 AM »

Can you list please current technological or social challenges that need addressing?

I want to submit to an innovation competition, but I cannot pinpoint specific problems to focus on.


Living Room / Fix screen orientation
« on: June 22, 2018, 11:41 AM »

How can I disable or permanently fix the touchscreen rotation of my Lenovo Yoga 13 please?

There seems to be a hardware (or software) problem that makes my laptop start with disoriented screen.


Living Room / How to move forward with an idea?
« on: June 09, 2018, 05:07 PM »

I have a revolutionary idea for a medical device. I have researched its components and how it would work, although it is in very early stages as an idea.

How can I get help to develop it further and maybe commercialise it?

I don't have any money to invest, but I am willing to give a percentage of my rights to have it researched further.

Any idea?


Living Room / Re: Looking for smartphone
« on: June 08, 2018, 07:07 AM »


Living Room / Basic banking issues
« on: May 15, 2018, 10:00 AM »

I have some banking interviews and I would like some resources. Unfortunately, while there is tons of info on the web, I find it either too basic or too technical. Some forums on banking recruitment are hideous (probably people feel competitive and they don't support each other).

Can you tell me please where I can find comprehensive information about issues like:
1) What factors affect risk of mortgages?
2) What would you check before giving a loan to a corporation?
3) How would you value a corporation?


Living Room / Help with laptop
« on: May 11, 2018, 04:55 PM »
hello guys!

I have a serious problem with my Lenovo Yoga 13 (first model) and I would really be grateful if you can help me solve it so that I won't have to waste £500 for a new laptop.

Basically, it's a convertible with touchscreen, and now that I restarted it, the screen shows as an A4, ie the screen is rotated 90 degrees!

I am worried that it is damaged because it fell a few days ago, but it didn't behave like that immediately after.

Any idea??
Can I just completely disable screen rotation or something?

Living Room / Re: Looking for smartphone
« on: May 05, 2018, 02:55 PM »
Any rival for Umidigi S2 PRO?

I will eventually have to ditch my Xiaomi Redmi Pro 3. After recommendations here I bought it, which proved to be good, but it has a major lack, the lack of NFC, so I cannot use it to make payments.

Living Room / Re: Any native english speakers?
« on: May 05, 2018, 02:53 PM »
It's a lack of reasoning game, based on interpreting the questions in the best way for you!  ;D

Then "I and II only" is also correct, (dependent on a better image).

I could probably come up with some esoteric branch of mathematics that also proved that III was correct.

Eventually, I think all three are correct.
III is correct because the square root of 6 raised to the power of 3 equals six times the square root of six.

Living Room / Re: grab urls
« on: May 01, 2018, 05:04 PM »
It is strange that when I download JDownloader with IE or Chrome, Windows Defender intervenes and blocks the file.
When I download with Firefox, nothing happens!
